Demonstrating Student Performance in Essential Schools
Sidebars:
The School Inquiry Cycle: Continuous Improvement, Critical Review
Failure by Design: Why Tests Don't Show What Students Can Do
A Consumer's Guide to Those "Standardized" Test Scores
Graduating by Exhibition: One School's Plan
Crash Gordon Takes a Test
State Assessment Systems: A Report Card
Do State Curricula and Tests Work?
One School's Alternative for Recording Student Learning
A 'Multiple Choice' for Parents
Fix the Problem, Not the Blame
How do you recognize a good school? Test scores tell us something, but not enough. We need instead an array of evidence that students are-or are not-learning the things that matter. And the public must join school people to understand and weigh that evidence, then plan together how to improve the work of our students.
A FEW MORE HARD-WON triumphs came this year to Hoover High School, a Coalition member school that serves the poorest students of all San Diego's 161 schools, and half of whose students have limited proficiency in English. After five years of steady change efforts, the school has now more than doubled the percentage of its graduates who go on to post-secondary studies. All students are learning key academic skills through a groundbreaking program that links them to the world of work. And the U.S. Department of Education has named Hoover one of five exemplary urban high schools in the nation.
But that's not what the San Diego school district wants to talk about just now. Based on its test scores and grade point averages, Hoover has just shown up on the district's list of "low-performing" schools, risking a dramatic district intervention that could dismantle the school, scattering or firing its faculty.
As pretty much any Hoover student, parent, and teacher can show with their own "hard data," this school prepares kids well for the future-despite the fact that 95 percent of its students qualify for free or reduced lunch, and a quarter of them will move in and out of the school during the course of a year.
Hoover students maintain digital portfolios, holding their work against school-wide standards that derive from the Secretary's Commission on Achieving Necessary Skills (SCANS). The school's efforts to integrate work-based learning with academics have drawn attention from university admissions officers, private business, educators, and the press. Education Week and Teacher magazine featured Hoover recently in a long, admiring article, and the National Association of Secondary School Principals named its principal, Doris Alvarez, as 1997 Principal of the Year.
The Hoover school community worked out a plan in 1992 for its own improvement. Since then it has kept track of its progress by collecting and analyzing a broad range of data: not only college entry and retention information, but student and teacher portfolios, comparative test scores over time, the records of students that stay at Hoover till graduation, student and parent surveys, and more.
Still, sorted against all the kids in San Diego, the test results here do look bad. The way norm-referenced tests work, somebody has to stand at the end of the line, and the kids from this very poor neighborhood south of Route 8 are those most folks might put there.
So is Hoover succeeding or is it failing?
"One-shot test scores and aggregate grades simply don't measure the kind of performance Hoover students actually accomplish, nor do they come close to measuring the impact of the school on its students," says Rob Riordan, who directs the New Urban High School, a joint project of the U.S. Department of Education and the Big Picture Company.
Ironically, he adds, "our project picked Hoover precisely because its methods of assessment place it firmly in the camp of those who argue for school accountability."
But while this story unfolds, other Essential schools around the country are watching with dismay as more districts unleash similar instances of what Larry Rosenstock, the president of Price Family Charitable Fund in San Diego, has termed "the pedagogy of public humiliation."
They employ one-time, centralized measures that cannot get at certain critical information and that sort students by mean scores. They dictate the fix: curriculum, teaching method, more tests. And the chilling and punitive climate that results, teachers say, strips them of their right and responsibility to exercise professional judgment in class.
The results can be disastrous to student learning. From January onward, one research group discovered this year, Chicago teachers now virtually abandon any instruction that does not directly relate to the city's narrowly prescriptive yearly multiple-choice achievement tests. Though the results can close down their schools, teachers say, the tests do not measure the most important things they teach.
Nor do they measure anything with much accuracy, or even spur kids to learn more. In fact, the most recent report from the Commission on Teaching and America's Future shows that 1990s test-based reforms in Georgia and South Carolina had no effect on student achievement, while states that invested in teaching had large increases in achievement.
But under increasing pressure to look good, school people say, they have learned to rig test results by teaching kids test-taking tricks (answer every question; wrong answers don't affect the score), retaining students in lower grades so their scores will look better, or simply selecting out students who will not score well.
And many schools shy away from even collecting information about student learning patterns. If they reveal their trouble spots, they fear, the state or district will slap them with sanctions and interventions that take away their ability to improve.
A Better Body of Evidence
But collecting and revealing that information is exactly what Essential schools must do, contend Coalition leaders who aim to demonstrate the value and effectiveness of teaching to a different kind of "test"-based on human judgment, and measuring both essential knowledge and inquiring habits of mind.
To balance a political climate that puts its money on the numbers, Essential schools around the country are assembling an array of additional and better evidence-portfolios, public exhibitions, outside reviews by visiting educators, and more-that represents the real work they know their children accomplish.
They are cutting through the accounting-book rhetoric to take a closer look at student work and at the assignments that prompt it. They are charting the many factors that contribute to higher student achievement but are difficult to quantify-parent involvement, teacher reflectiveness, supportive connections with other schools, and more. They are launching long-term efforts to follow how their graduates do in college and the workplace.
And they are opening up the conversation about student performance to their own communities-coming to a deep and common understanding of what their students know and can do, the reasons for that performance, the best possible ways of improving it, and each person's individual responsibilities in doing so.
In the San Francisco Bay Area, former Oceana High School teacher John Larmer works with Kate Jamentz at the Western Assessment Collaborative at WestEd to help schools and districts look closely at their standards and how teachers use them in assessing student work.
"Does the school say it wants kids to reflect on their own learning process and assess their own work?" Larmer asked a workshop group of Essential school teachers at the Coalition's 1997 Fall Forum. "Let's check out four pieces of student work. Is there evidence that these students know the standards on which they're being assessed? Do they seem to have the habit of monitoring their own progress?"
"We want to see students who are actively engaged in meaningful work, who know important things and can use them, and who can tell us what they're doing and why," Kate Jamentz says. "Five years from now, kids will be talking about their work in terms of quality, not just in terms of 'getting it done.' They'll have the habits of rehearsal and revision, and they'll be able to ask teachers for the help they need."
But to make that happen, teachers have to cultivate "the habit of relentlessness," as Larmer and Jamentz call it-by systematically designing class activities that get at their standards, then "reading" students' work against those standards-not only to "score" it, but to analyze what further help kids need.
To build and sustain those skills, they assert, school communities must look together at student work-to define common standards, and to examine whether students are getting the help and opportunities they need to reach them.
Examining the Assessments
In the San Mateo-Foster City district, Jamentz gathered parents, teachers, administrators, and community members for one intense afternoon to talk about how their district was measuring children's reading skills. Hunkered down intently over a language arts test, their pencils sometimes slippery with nervousness, the clock ticking away as they figured which answers of the multiple choices were likely to be right, these Californians got a close-up look at what tests could show about their kids-and what they could not.
"This is fine as far as it goes," one parent said after several hours of trying out a standardized test, a directed writing assessment, a benchmarked reading task, and more. "But there's a lot it doesn't show. Are you getting any information, for example, on whether my child likes to read?"
Other schools invite their constituencies to help shape, review and tune school standards in collaboration with teachers. Parents, students, and teachers at the Francis W. Parker Charter Essential School in Devens, Massachusetts gather periodically to sort writing and math problem-solving samples at different levels into piles labeled "good enough," "not good enough," and "better than good enough" for their students.
At intervals, Parker also invites veteran teachers from comparable schools to review a sampling of its year-end portfolio assessments, giving important feedback on how reliable Parker ratings are compared to those of outsiders. Last year, the outsiders' ratings agreed with the insiders' ratings anywhere from 86 to 100 percent of the time, providing important corroboration for any who worried about objectivity in portfolio assessments.
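For readers who want to see how an agreement figure like the one above is computed, here is a minimal sketch. It assumes the simplest possible measure, the share of portfolios that insiders and outsiders rated identically. The rating labels and the ten sample ratings are invented for illustration and are not Parker's data, and the school may well have calculated agreement differently.

```python
def percent_agreement(insider, outsider):
    """Return the percentage of portfolios given the same rating by both groups."""
    if len(insider) != len(outsider) or not insider:
        raise ValueError("rating lists must be non-empty and the same length")
    # Count the portfolios on which the two ratings match exactly.
    matches = sum(a == b for a, b in zip(insider, outsider))
    return 100.0 * matches / len(insider)


# Hypothetical ratings for ten portfolios (illustrative only).
insider = ["meets", "meets", "exceeds", "approaches", "meets",
           "exceeds", "meets", "approaches", "meets", "exceeds"]
outsider = ["meets", "meets", "exceeds", "meets", "meets",
            "exceeds", "meets", "approaches", "meets", "exceeds"]

# The two groups disagree on one portfolio out of ten.
print(percent_agreement(insider, outsider))
```

A simple match rate like this does not correct for agreement that would happen by chance, so a school reporting it would want to say so.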
Human Judgments Matter
At Fannie Lou Hamer Freedom High School in the Bronx, 16-year-old Evelyn Abreu stood in front of a graduation committee and in a lilting Spanish accent described her social studies research into the "English-only" movement in this country. "I talked to my own family about their immigration experience," she said. "I followed the Congressional debate and the Supreme Court decision on a case in Arizona. I read the book Hold Your Tongue and an article by Senator Daniel Akaka, and I interviewed someone from the office of Jose Serrano here in New York."
Before she graduates in June, Evelyn must successfully present and defend seven completed subject-area portfolios before a committee made up of two teachers, an outside adult (sometimes family), and an eleventh-grade student.
"It's stressful," she says. "You need to have confidence in yourself. But you gain this by concentrating on your work, and remembering that you have already done what you needed to-now you just need to explain your work." Right now Evelyn is revising a science portfolio presentation on tuberculosis that she tried out last spring for her committee. "They told me what I needed to fix," she says, "and I'm working on it. It's useful, because we learn to use the habits of mind."
Such graduation practices are spreading among Essential schools committed to knowing their students well and demonstrating knowledge and thoughtfulness through public exhibition. Though they may be more subject to human fallibility than a standardized test, advocates say, they demonstrate the very purpose of schooling in a democratic society: to teach and practice human judgment.
"In the name of objectivity and science, the testing industry has led teachers and parents to doubt their own judgments about their children," declares Deborah Meier, who recently founded the new Mission Hill School in Roxbury, Massachusetts.
Personal judgments hold even more power, she argues, because the evidence they depend on comes from people close in to the school's experience. In small schools like Hamer, where students, teachers, and parents know each other well, it's hard to fool them on quality.
"Kids start coming home and talking to parents about what they're doing in school, or asking them about current issues, and a new level of intellectual exchange starts," says Ann Cook, who heads the Urban Academy, a small school that, like Fannie Lou Hamer High School, is affiliated with the Center for Collaborative Education (CCE), a regional Coalition center in New York City. "Most high school parents aren't in schools much, you know. They make judgments about us based on their own close-up looks at student learning."
"Hard" and "standardized" data about student achievement also show up plainly through human observations, Deborah Meier points out. At her new Mission Hill School, teachers tape record all children twice a year to document how they handle written text and how they talk about books and language.
"It's standardized, because all kids are interviewed in a common fashion," she notes. "And it's public. And interested parties can verify or contest its conclusions. Most important, students, parents, and teachers have immediate access to common information-unlike test results, which arrive months later in a format that makes it impossible to tell what kids got wrong or why."
Multiple-choice tests that must remain secret to work properly, Meier points out, provide only indirect data about student performance. Portfolios or exhibitions provide direct data; so does information on what happens to kids after they graduate, which Central Park East Secondary School (also founded by Meier) has been gathering for several years now.
Assessment as Videotape
Human observations need not displace other information-gathering efforts; in fact, they can inform research, as schools explore the efficacy of their new designs and practices. Many Essential schools work with university-based researchers, investigating their own questions about the central issues of teaching and learning they face. In New York, CCE follows up such "action research" with events where small networks of schools develop, share, and critique their ideas and projects.
But because school change involves changing an entire culture of expectations and habits, its patterns emerge slowly and are often hard to chart. "We need a videotape, not a snapshot, to see the truth about student performance," one Essential school teacher observes.
That videotape may come about when schools and districts begin to work with state offices to limit the extent and stakes of testing, and to develop supports for thoughtful standards-driven assessment by teachers in schools.
In Maine, for instance, the state's new performance-based assessments may soon serve not as a graduation exam, as some states have it, but as a checking measure against locally devised and driven assessments, says David Ruff, who helps direct the Southern Maine Partnership, a regional CES center.
"Now that we have our state 'Learning Results' standards in place," he says, "the state is hoping to have local schools take 90 percent of the responsibility for aligning their own curriculum, instruction, and assessments with them." In a perfect world, Ruff observes, what the school knows through its portfolios or other assessments should match kids' scores on the "extended-open-response-item" state tests they take in grades four, eight, and eleven.
"If some scores show drastic differences," he says, "it could reveal a problem at the school level-or it might just mean the student was sick that day. The school gets a chance to defend the child," who might well show very powerful learning in another time or form.
Such ownership of statewide standards by local communities is very powerful, Ruff says. "But it will take a mammoth professional development effort," he cautions, "to come up with a system where local assessments are valid and reliable."
Essential School principles have had a deep and lasting effect on state educational leaders in Maine, spurred by its early membership in the Re:Learning initiative of the Education Commission of the States. The results stand out on the national scene; the National Center for Fair and Open Testing (FairTest) praises Maine's state assessment system as a "modest and exceptional approach."
But in most other states, FairTest shows, tests are not even based on the new standards states have adopted. And their largely multiple-choice questions typically fail to provide a range of methods for students to demonstrate their learning. (See sidebar)
Moreover, test scores alone reveal more about student demographics than about student performance, critics assert. When Paul Harrington of Northeastern University's Center for Labor Market Studies recently compared statewide student test results in Massachusetts against a list of communities rank-ordered by per-capita income, the ten highest-scoring communities were the ten wealthiest, and the bottom ten in test scores matched nine of the poorest communities.
High Stakes, Low Learning
The picture can get even bleaker when states attach high-stakes outcomes like high-school graduation or teachers' careers to such tests, whether or not they reflect a community's consensus on standards, or correlate to anything important.
In Louisville, teachers bristled recently when the state used test scores to slap an "in crisis" label on the J. Graham Brown School, long regarded as a model of Essential School practice from kindergarten through high school. It dispatched "distinguished educators" to coach teachers on how to improve, but the school climate had turned so tense and defensive that even good advice, observers say, largely fell on deaf ears.
Chicago holds back eighth-graders who score below grade level on the content-driven standardized tests given at every grade level every year. If a school's scores are too low, it risks an "intervention" in which the school board conducts a public hearing, closes the school, and evaluates all the employees-who can lose their jobs without due process protections if they do not pass inspection.
Even if money then flows to a low-performing school to help improve its performance, such a climate can work at cross purposes, points out one Chicago principal.
"When your scores start getting better, the money that's helping you dries right up," she observes dryly. "That doesn't exactly make schools want to come forward with data."
But some Essential schools in states with high-stakes assessments are augmenting or even replacing their state's measures with more authentic-and often more difficult-accountability plans and demonstrations of what their students can do.
Three Essential "pilot schools" in Boston, freed from many district regulations, will substitute for the district's usual Comprehensive School Plan a three-year review cycle in which they identify key areas needing attention, assemble portfolios of evidence, and invite outsiders to review their progress. If it works well, the plan may end up informing district policy.
Ohio makes all students pass a standardized proficiency test to graduate. But juniors at Reynoldsburg High School must also demonstrate their understanding of history by completing a semester-long "junior thesis" course, in which they either carry out a community or political action project or defend a historical thesis centered on an essential question of their choice.
"For logistical reasons, we had to exempt from this requirement those students who were already taking a year-long Advanced Placement (AP) European history course," says teacher Steve Shapiro, who developed the junior thesis seminar with his colleague Rob Sass. "It's ironic: students find the thesis course so rigorous that our AP students have actually tripled in the year since."
Because they accept only certain strictly labeled "A to F" high school courses for admission, California's selective state universities make it hard for Essential schools to show student learning in courses that cross disciplines or mix students instead of tracking them. But at Homestead High School in Cupertino, Lauri Steel and other teachers have been developing an alternative transcript to document students' learning in more authentic ways.
The new transcript does list courses grade by grade, but the "restructured" interdisciplinary courses do not appear with letter grades beside them. Instead, they show up on an attached "transcript of portfolios," which scores student knowledge, communication skills, work habits, and habits of mind in relation to the school's standards. (See sidebar)
College entry requirements also perpetuate the use of Carnegie units, which chill efforts to cross disciplinary boundaries in curriculum and teacher training and allocation. Some Essential schools have resorted to keeping an odd kind of double books on their transcripts-listing an integrated course called "The Craft of Science" for instance, with the extra "physics" label for any who need their sciences neatly sorted.
In the best of cases-and usually when teachers help create them-new state standards and assessments actually line up well with the bold changes schools are making. Four years ago, Grass Lake High School in Michigan eliminated math tracking and adopted the Core Plus math program, developed at Western Michigan University. All students now take the three-year integrated math program, says teacher Larry Poertner, and despite detracking, the percentage who pass the state's mandatory high school proficiency tests has not dropped.
"The test dovetails nicely with the program, in which kids can approach a problem from any method they want as long as they justify their methods," he says. "I see all our kids becoming more rigorous thinkers, and working together more cooperatively."
Those districts willing to face the expense can also turn to the tough New Standards assessments, created by Lauren Resnick's group at the University of Pittsburgh and the National Center on Education and the Economy. Instead of ranking students as norm-referenced tests do, these aim to determine whether students have learned a set of tasks, concepts, and skills set out in accompanying standards for English language arts, mathematics, science, and "applied learning." New Standards hopes its tests will be flexible enough to assess a range of school-crafted curriculum, though many Essential schools express worry about that. And the tests are quite hard; schools where classes are still driven by old-style tests will also have to scramble to do well on them.
Information Out in the Open
But creating richer assessments addresses only part of a school's responsibility to publicly examine and analyze what it finds out about student learning. And it takes courage and trust, Essential school people agree, to have honest conversations-with parents, teachers, and higher-ups-about student performance and how to make it better.
Teachers may know when their students are not meeting the outcomes they set, says Linda Belarde, who heads a small Essential high school on a Zuni reservation in New Mexico. "But it often doesn't feel safe to disclose that to their colleagues and the principal," much less to outsiders.
The more schools make kids' work public, through exhibitions and other community gatherings, the more they build a context in which the public knows students better and can participate in, not just judge, a school's improvement.
"We're aiming for our school to work along the model of an artists colony," says Teri Schrader, who leads the Arts and Humanities faculty at the Parker School. "Teachers can share work in progress, even if it's not yet successful, with the faith that the whole school community will be able to help bring it along."
At a recent parents' forum, for example, Schrader asked parents to help think through sensitive issues in an upcoming family history unit. And in twice-yearly conferences with teacher-advisers, Parker parents and students help create "personal learning plans" that spell out their strengths, goals, and strategies for each year.
Trust also grows when school people use plain language to talk about student performance. "I tell parents and kids straight out when the kids cannot read, write, and do the math they need to," says Fran Vandiver, who at Florida's Fort Lauderdale High School began a "prerequisite academy" to bring entering students to the point where they could handle the school's expectations. "Then we have to decide how to address that problem over time, and how to measure and value their progress fairly." Vandiver uses the Test of Adult Basic Education instead of conventional school tests, because "it tells us clearly when kids' reading and math applications are up to speed."
Demystifying standardized tests can also clarify the community's goals, say Essential schools that have tried it. For example, when Massachusetts parents were shown samples of items on the new state exams their children will soon have to pass for graduation, even upper-middle-class families with high levels of education-the ones everyone assumed would support the measures-were taken aback. Must all tenth graders, they wondered, know the structure of DNA?
In fact, four very different curricula make up an American public education, Stanford professor of education Larry Cuban has noted: the standardized curriculum that appears in all the textbooks, the curriculum actually taught from day to day by teachers, the tested curriculum-and, finally, the learned curriculum, which students carry with them into the world.
"Essential schools aim for the learned curriculum," declares Larry Rosenstock. "And their standards are actually much higher than the standards that show up in these laundry lists of content."
Meanwhile, at Hoover High School in San Diego, teachers are on the phone tracking down students from the class of 1996-the ones who did so badly on the district's tests that their alma mater is in trouble now. Of the 197 graduates they have reached so far, 129 are in two- or four-year colleges. Five are in trade schools; eight are in the armed services; 49 are working; and the rest are at home.
One of those HHS graduates is Manassa Abraham, who came here with his parents from a starving Somalia a few years ago. Now he is working toward a degree from San Diego State, with a full scholarship from Future Educators of America.
Back at Hoover, Lia Thao-who left Laos a decade ago with her non-literate Hmong family and now lives with 14 relatives in their San Diego apartment-watches his path, and wonders why the district thinks her school is failing. Her own future, she realizes, depends largely on whether Hoover succeeds in helping her learn what she needs to know.
Can the tests adequately measure that knowledge and those skills? Can they reveal what we need to know about Hoover? As Essential school educators take up the challenge to help their communities as well as their students learn in more effective and authentic ways, they persist in the hope that children-not tests-will soon move to the center of the public gaze.
The School Inquiry Cycle: Continuous Improvement, Critical Review
What does an ongoing school self-assessment plan look like when it aims for continuous improvement of teaching and learning? The Southern Maine Partnership, a regional Center of the Coalition, helps its member schools to incorporate documentation, analysis, communication, and constructive debate in the manner of the "School Quality Reviews" being developed by many others:
1. The school community poses key questions on which to focus its self-study (such as, "How does assessment drive our instruction?" or "How does student reflection inform our practice?").
2. The school collects data relevant to its question, through exploring student work, classroom observations, shadowing students through their school day, interviewing or surveying members of the school community, and reviewing documents and policies.
3. The school community engages in dialogues focused on teaching and learning, using tools and processes like the "collaborative school protocol," the "collaborative assessment conference," the "descriptive review" of student work, the "tuning protocol," the "slice" of student work across a designated time period, or others.
4. The school hosts a visiting review team (including teachers from partner schools, university-based educators, administrators, and community members), which spends from two to five days in the school observing, interviewing, and finally reporting its observations to the staff (both orally and in writing) on the focus question the school has posed and on the Partnership's guiding "learner-centered principles." (The cost of this visit is usually under $3,000, participants say, including substitute fees and lodging for the visitors.)
5. The school develops ways of reporting out to the community and discussing what actions to take next on the basis of the information gathered.
6. The school and community identify new questions and continue the cycle.
The Southern Maine Partnership is based at the University of Southern Maine, Bailey Hall Room 117, Gorham, ME 04038. Telephone: (207) 780-5669.
Failure by Design: Why Tests Don't Show What Students Can Do
No standardized, norm-referenced test, assert Coalition leaders Theodore R. Sizer and Deborah Meier, can measure the real payoff from serious study-"an examined and useful life," as Sizer puts it-nor can it describe the good school that works to achieve that end.
"Any assessment that correlates poorly with a student's intellectual future offends us, putting stress on teachers and students and taking up their time," Sizer says. "Essential schools, instead, call for evidence that a student not only understands something now, but can use that understanding in a new situation, now and very likely in the future."
A mountain of evidence on how people learn, from two decades of research on cognition, bears out that position. If their goal is real understanding on the part of students, Howard Gardner and others have shown, teachers must use a variety of approaches, allowing students with very different kinds of intelligence to increase their strengths and to struggle productively with areas they find difficult.
The standardized tests in wide use across the country today do not support this kind of teaching-nor, most research shows, do they predict later academic success, except at the extremes. In fact, data from the National Center for Education Statistics and American College Testing indicate that success later in life correlates better with learning experiences that are rarely tested, such as extracurricular activities.
"Conventional tests spotlight children who have certain abilities-especially memory and abstract-analytical ones-but leave in the dark children with other kinds of abilities, such as creative and practical ones," says Yale professor Robert Sternberg, the author of Successful Intelligence (Simon and Schuster, 1996). "They predict only about ten percent of the variation among people in real-world measures of success-the ones that matter most in the life activities for which school is supposed to prepare our children."
If schools instead identify and teach to students' strengths, Sternberg's studies show, kids with practical and creative abilities perform far better than if only their memory and abstract analytical abilities are emphasized and tested.
Early in their lives, he argues, we are derailing from the fast track students who never get the opportunity to show what they can really do-at the same time providing almost limitless opportunities for others who may not have the practical and creative abilities they will need to use their other skills effectively in the real world.
The very design of all norm-referenced tests ensures this outcome, Deborah Meier points out. Because their goal is a bell-shaped curve of scores on which certain students must by definition do poorly, test-makers actually discard items when too many, too few, or the "wrong" subsets of test takers (such as disadvantaged children) know the answer. Otherwise, she explains, the results would not correlate to all the other norm-referenced tests (from IQ tests on) with which they must agree in order to be valid. "All these tests predict," she says, "is how students will perform on other tests just like them."
Though some new tests attempt to measure students' progress against standards of content and performance, they fall into the same trap the moment they start yielding percentile scores, Meier observes. And if they try to establish and test content that every student must know, they must either eliminate diversity to the point that they are trivial and uninteresting-or individualize exams until they cost far more than schools can pay.
"What if the driving tests were deliberately made up so that 50 percent of all license seekers would not be able to pass?" she asks. "We have an auto industry that would squash such nonsense in a hurry. What industry is making sure that our tests are sensible and fair to all children?"
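Meier's point about percentile scores can be made concrete with a little arithmetic. The sketch below is only an illustration: the ten raw scores are invented, and `percentile_rank` is a hypothetical helper rather than any test publisher's actual scoring method. It assumes every student improves by the same amount, then checks what happens to each student's standing.

```python
from bisect import bisect_left


def percentile_rank(score, group):
    """Percent of the group scoring strictly below `score`."""
    ordered = sorted(group)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


# Hypothetical raw scores (items correct out of 50) for ten students.
before = [12, 18, 21, 25, 27, 30, 33, 38, 41, 46]

# Assumption: every student learns enough to answer four more items correctly.
after = [s + 4 for s in before]

ranks_before = [percentile_rank(s, before) for s in before]
ranks_after = [percentile_rank(s, after) for s in after]

# Count the students who still rank below the 50th percentile.
below_median_after = sum(r < 50 for r in ranks_after)

print(ranks_before == ranks_after)  # standing is unchanged
print(below_median_after)           # half the group is still "below average"
```

Every raw score rises, yet each percentile rank stays where it was, because a rank reports only where a student stands relative to the others.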
A Consumer's Guide to Those "Standardized" Test Scores
How much attention should you pay to test scores? "A test score alone offers too little information to make meaning of it," says Paul LeMahieu of the University of Delaware, who also directs research and development for Delaware's education department and has written extensively about the purposes and techniques of different forms of assessment. Before rushing to actions aimed at raising scores, he suggests asking:
1. How does the testing instrument define what it sets out to measure? Nothing is more important. It may be a math test, for instance, but does that mean math computation or the cognitive processes of solving problems? You may care about only part of what is tested, about it all, or about very little of it.
2. What kinds of judgments, decisions, or actions do you need to make? What kind does the test legitimately support? This is the "validity" question: tests are built differently for different purposes. If you want to make decisions about what or how to teach a particular child, don't go to a test designed to determine eligibility for a diploma-use an assessment designed for your purposes.
3. What's the nature of the information provided by the test? Does it refer to whether students know particular material measured by the test questions? (These are criterion-referenced tests.) Or does it quantify how much students know about what is tested compared to other students who took the test? (These are norm-referenced tests.) Some tests score students as to whether they meet specific desired levels of performance (standards-referenced tests). Or a test score can chart its taker's own change or growth (over time, for instance, or across categories). Depending on your questions about student learning, one or another of these kinds of information may be right for you.
4. How accurately and well does the test measure the abilities of all kids? Look at the questions it includes-they can greatly influence the kinds of scores it yields for students in particular circumstances. And changing various factors not relevant to the matter being tested-such as extending timing, using plainer language or translating assessment tasks, and accepting oral responses-could yield more trustworthy information.
5. What actions or system of actions, if any, does the test suggest? Only the best of tests provide a level of detail that can signal ways the system could do better by kids. Aim for an assessment system that inspires new conversation among professionals and the community, clarifying the expectations for students and disciplining the understanding of what quality is and what serves as its evidence.
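The four kinds of information named in question 3 can be drawn from the same raw score, which is why it matters to know which one a report is giving you. The following is a minimal sketch under stated assumptions: the function names, the cut score, the comparison group, and the performance levels are all hypothetical, chosen only to show how one student's result reads under each interpretation.

```python
from bisect import bisect_left


def criterion_referenced(raw, cut_score):
    """Did the student reach the score taken to show mastery of the material?"""
    return raw >= cut_score


def norm_referenced(raw, norm_group):
    """Percent of a comparison group scoring strictly below this student."""
    ordered = sorted(norm_group)
    return 100.0 * bisect_left(ordered, raw) / len(ordered)


def standards_referenced(raw, levels):
    """Name of the highest performance level whose minimum score is met."""
    label = None
    for minimum, name in sorted(levels):
        if raw >= minimum:
            label = name
    return label


def growth(earlier_raw, later_raw):
    """Change in the same student's score between two occasions."""
    return later_raw - earlier_raw


# All values below are invented for illustration.
norm_group = [22, 25, 28, 30, 31, 33, 35, 36, 38, 40]
levels = [(0, "needs development"), (24, "advancing"),
          (32, "proficient"), (38, "exceeds")]

report = {
    "criterion": criterion_referenced(34, cut_score=30),
    "percentile": norm_referenced(34, norm_group),
    "level": standards_referenced(34, levels),
    "growth": growth(27, 34),
}
print(report)
```

The same score of 34 yields a pass, a percentile, a named level, and a gain, each answering a different question about the student.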
Graduating by Exhibition: One School's Plan
Many Essential high schools have worked out ways to publicly demonstrate their students' readiness for graduation-ways that reflect both their community's own values as to what students should know and be able to do and their belief that depth and thoughtfulness, not coverage, must govern the curriculum. At Anzar High School near Monterey, California, students take junior and senior years to complete six exhibitions, scheduled during a three-day session six times yearly and presented before a jury of staff, students, and community members.
Because the school has adopted the five familiar Essential School "habits of mind" (evidence, perspective, extension, relevance, and reflection) as touchstones across its curriculum, these serve as the criteria against which these exhibitions (in their written, oral, and question-period aspects) must pass muster. (See scoring criteria below.) Students almost never present until they have shown readiness, but in rare instances a presentation is sent back for more work before it passes. The six categories of exhibition are summarized below; students may sometimes combine exhibitions where appropriate.
1. The postgraduate plan, which includes an employment or college portfolio, a physical challenge portfolio, and a self-reflection piece.
2. The history-social studies written and oral exhibition, in which students choose a complex topic of personal interest, draw conclusions and make connections about it based on research, and project patterns from it into the future.
3. The language arts written and oral exhibition, in which students draw on material they have read and written, taking and defending a position that considers many sides of a complex issue.
4. The science exhibition, in which students create, conduct, document, and defend an experiment related to a complex topic and carried out using the scientific method, including written and oral explanations and analysis.
5. The mathematics exhibition, in which students complete a pure or applied project exploring a key mathematical question or conjecture, and provide an in-depth, clear, and correct explanation using at least two different approaches.
6. The service learning exhibition, in which students present their service experiences over the course of high school, reflect on their value, describe plans for future service, and reflect on their responsibilities to society.
In addition, the Spanish or world language component requires students to present orally in Spanish or another world language, as a continuous and meaningful part of at least one of the above exhibitions. And the arts component requires students to create and incorporate into at least one of the above exhibitions a wholly new work in music, dance, or the visual arts.
SCORING CRITERIA
Presents biases of self, others, and the research used.
Indicates understanding of alternative points of view and experiences.
Presents sufficient evidence, including multiple choices available, clearly and convincingly.
Presents an analysis of the deeper implications, including how the student's conclusions might affect the future, what might happen if something changed, and identification of any patterns or connections to other ideas.
Shows a deep understanding of the topic's relevance to self and community.
Includes reflection on what the student learned and what other questions the project brought up.
If appropriate to the topic, shows empathy and explains how doing the project changed the student's thinking.
Crash Gordon Takes a Test
(and the community decides what's fair and credible)
Talking with a lay audience about whether different kinds of student assessments are credible and useful, Kate Jamentz of the Western Assessment Collaborative at WestEd likes to use this little test-taking fiction:
Crash Gordon has been enrolled in Fly-by-Nite Pilot School for three weeks. The school promises that by successfully completing this course, Crash will be ready to pilot 747 commercial jets. Crash has been told that Fly-by-Nite is a highly regarded school. Its ads report that nearly 95 percent of its graduates score above average on the final exam, and its tuition costs are low. Five years ago, Fly-by-Nite replaced its expensive flight simulators with textbooks and films that explain in detail how to fly a plane.
Today is Crash's midterm exam. If he passes with a grade higher than 50 points he can skip the rest of the course and get his license right away. If he scores from 25 to 50 he must repeat the course, and if he doesn't finish or scores below 25, he will be kicked out. The stakes are high.
Crash opens the exam and finds 100 multiple-choice questions that he needs to answer in 30 minutes. One third of them are about the parts of a plane; another third ask about how to read a flight schedule; the final third cover the dress code for pilots. None of these topics has been covered in the course. About half the questions in each section are in French or Spanish-because 747s usually fly international routes, Crash figures. The last question asks for a brief written answer to the question, "Who has most influenced your life as an aviator? Explain."
Crash is not daunted. He realizes that he has to answer quickly, so he fills in the answers according to a pattern: every fourth question will be "A," every third question "B," and so on. His teacher, Mr. Soar, will be scoring the exam, so Crash uses the essay question to explain how much Soar's teaching has meant to his career. He finishes the exam with three minutes to spare.
When the exam scores are posted, Crash is elated: his strategy has paid off. He got 29 points, two points above the average score, which was 27. He leaves eager to improve his score next time and become a pilot-so he goes to the airport and sits near the crew lounge, where he knows he can learn more about the pilot dress code.
"How comfortable will you be flying with Crash if he gets his pilot's license next time?" Jamentz asks the group. "Why? What does this story suggest about assessment information that we trust and can use?" Participants go on to hash out the qualities that make an assessment valid, fair, credible, and useful, then hold up their district's measures against those definitions. "It's a way to quickly launch a rich and concrete discussion," Jamentz notes, "in terms people can easily relate to."
The Western Assessment Collaborative at WestEd regional educational laboratory is at 730 Harrison Street, San Francisco, CA 94107; (415) 241-2704.
State Assessment Systems: A Report Card
One-third of state public school testing systems need a complete overhaul and another third need major improvements if they are to provide support for high quality teaching and learning, according to a new study by the National Center for Fair and Open Testing (FairTest), which evaluated assessment practices in all 50 states against standards endorsed by more than 80 organizations and scores of prominent education reformers. "While a small number have made significant progress," says Monty Neill, FairTest's associate director, "most are just tinkering at the edges of reform" on five key standards:
- Assessment supports important student learning.
- Assessments are fair.
- Educators receive adequate professional development in assessment.
- Systems are in place for public information, reporting, and ensuring parents' rights.
- Assessment systems are regularly reviewed and improved.
Most states do not do a good job of including students with special needs or those with limited English proficiency in state assessments by making appropriate accommodations or administering alternative tests, FairTest concluded. Teacher training in assessment remains weak in most states. And few states do a good job of evaluating the impact of their testing programs on classroom teaching and learning.
The states with top rankings in the FairTest survey tend to rely on multiple measures of achievement, including strong use of performance assessments or portfolios, and do not make high-stakes decisions based on the results of any one exam. Some rely on sampling for school and district accountability rather than testing every child.
Vermont, Maine and Kentucky primarily rely on extended, constructed-response, performance and portfolio assessments, the study found, and Maryland, Connecticut, Rhode Island and Colorado are among those that make "substantial use" of such methods.
Also on the positive side, the report found that 38 states now use writing samples, although 34 simply require students to respond to a prompt, thus fostering and evaluating a limited conception of writing. Most states also pay substantial attention to bias reduction in designing their assessments.
"Testing Our Children: A Report Card on State Assessment Systems" may be ordered from FairTest, 342 Broadway, Cambridge, MA 02139 (tel.: 617-864-4810; fax 617-497-2224). The executive summary and all the individual state reports are also on the FairTest web site: http://www.FairTest.org.
Do State Curricula and Tests Work? A Case History from the New York Regents
Those interested in what effects state curriculum standards and testing can have on student learning might study the history of the New York State Regents examination system, a long-standing example of a state-mandated curriculum and testing program. Though it is now undergoing a major revision toward a more performance-oriented model, the Regents system for years dictated a wide range of courses and content syllabi for high school students on two levels. The higher level, taken by about 35 percent of students, included at least eleven tests for the Regents diploma; students deemed less able had to pass at least six lower-level competency exams to receive a local diploma.
Though the exams were connected to a required curriculum, few demanded critical thinking, analysis, extended writing, or other performances, Linda Darling-Hammond observes in her recent book The Right to Learn (Jossey-Bass, 1997). "Students can complete the Regents curriculum without ever writing a paper of more than a few pages, reading a primary source in history, doing a research project, or designing a single science experiment," she writes. "The curriculum is so tightly prescribed that there is little room for addressing student needs or more ambitious learning goals."
Moreover, Darling-Hammond cites research showing that teaching to the rote-oriented state tests undermined the quality of teaching and learning in mathematics and science classrooms. "Although many students fail the tests," she writes, "those who pass do not appear to learn more effectively than students had before the tests were installed." As their performance on independent measures dropped, she notes, New York's schools grew more unequal in the opportunities and resources they offered students as well. "The theory that high-stakes testing would drive systemwide reforms," she concludes, "did not pan out."
One School's Alternative for Recording Student Learning
Like several other schools in California's "Transitions" project, Homestead High School in Cupertino is developing an alternative transcript that more accurately reflects the school's interdisciplinary courses, project-based learning, and performance-based assessments. Homestead plans to accompany the above with a list of interdisciplinary programs and courses and summaries of the school's expectations, performance standards, rubrics, and other relevant data. In addition, a conventional "course transcript" describes credits and grades for more traditional courses at the school. Below is a recent excerpt from a draft under discussion:
TRANSCRIPT OF PORTFOLIOS
Student performance in integrated courses is assessed in terms of the Habits of Mind, Knowledge base, Communication Skills, and Habits of Work demonstrated in student portfolios and portfolio presentations. Please refer to the accompanying documentation for descriptions of the curriculum covered in each course as well as the schoolwide standards for Habits of Mind, Communication skills, and Habits of Work. Student performance is assessed relative to these curriculum and skill standards: exceeds standards, proficient with regard to standard, on track to meeting standard (for senior progress check), advancing toward meeting standard, or needs to develop knowledge/skill in this area.
A Multiple Choice for Parents: How do you want your child's learning to be measured?
When parents get to leaf through the test items by which their children are sorted and ranked against each other-or when they sit down and endure an hour or two of taking the actual tests their children take-many are struck by how ambiguous the questions are, and how trivial and arbitrary as a summary of learning. What follows is a sampling of such items, followed, for contrast, by two exhibition questions from Essential schools, and finally by a question intended to provoke further thought about what we teach and what we measure:
Was the infantry invasion of Japan a viable alternative to the use of the atomic bomb to end World War II? If so, why? If not, why not?
A. Yes; transport ships were available in sufficient numbers. B. Yes; island defenses in Japan were minimal. C. No; estimated casualties would have been much greater. D. No; Japan was on the verge of having an atomic bomb.
[Released by an association of test publishers as an example of measuring "higher order thinking," that is, analysis, synthesis, evaluation]
To prosper is to A. thrive B. wander C. seek D. adapt
Decide if one of the underlined words is spelled wrong or if there is no mistake:
A. A beray is a French cap. B. Little rain falls in arid regions. C. The squirrel ran across the grassy knoll. D. No mistake
[Adapted from the eighth-grade level of a recently revised test widely used by districts]
The famous author won the ______________ several times.
A. pulitzer Prize B. pulitzer prize C. Pulitzer prize D. Pulitzer Prize
Which number sentence means, "Four times a number is thirty-six minus eight"?
A. n = 4(36 - 8) B. 4n - 8 = 36 C. 4n - 36 = 8 D. 4 x 8n = 36 E. not here
Hal deposited $500 in a savings account that pays 8% interest per year, payable every six months. How much will Hal have in his account at the end of six months?
A. $270 B. $540 C. $530 D. $520 E. not here
Randy is a student in ______________.
A. high school B. High school C. High School D. high School
[Adapted from a tenth-grade test by the same publisher]
An architect's most important tools are his
A. pencil and paper B. buildings C. ideas D. bricks
[Elementary reading item cited by Deborah Meier in "Why Reading Tests Don't Test Reading," Dissent, Fall 1981, pp. 457-66]
The current turmoil between the U.S. and Iraq is one more in a series of foreign policy challenges in United States history. Present the committee with three examples of other crises-one from the eighteenth century, one from the nineteenth, and one from the twentieth-that had a major impact on national or international events in its historical era. Also identify three literary or other artistic works that in some way derive from or comment on each crisis, and explain how they do so. Finally, explain how developments in science or technology in that time contributed to either heightening or defusing the crisis.
[From an Essential school's year-end 10th grade exhibition before a panel of teachers, peers, and community residents. Scored on content knowledge, communication skills, and critical "habits of mind"]
Does ethnicity and education status among City Heights women create a barrier around knowledge about breast cancer early warning signs and prevention options?
[From the Senior Project essential question framed by Hai Pham, a student at Hoover High School in San Diego]
Familiarity with Henry IV, Part II is likely to be of great importance in:
A. planning a corporate takeover B. evaluating budget cuts at the Department of Education C. initiating a medical liability suit D. writing an impressive job resume E. taking a test on "What Do Our Seventeen-Year-Olds Know."
[From "What Do Our Forty-Seven-Year-Olds Know?" in Benjamin Barber, An Aristocracy of Everyone (Oxford University Press, 1992)]
Fix the Problem, Not the Blame: Engaging the Public in School Accountability
Improving "accountability" by merely adopting new and enlightened assessments like portfolios and exhibitions will not go far, Paul LeMahieu contends in a number of recent published articles. Instead, schools and communities must come together in events that promote the honest disclosure and weighing of good evidence, planning, and action. The "accountability events" he urges are a process of continuous and public engagement, he says-a way to "fix the problem, not the blame." He suggests the following guidelines:
- The data under discussion must be understandable and intuitively meaningful to all participants.
- The information must be enriched by descriptions of the context, conditions, and educational practices that gave rise to it.
- The discussion should include all those who need to understand how schools perform, and whose contributions schools need in order to improve.
- The effort will depend on developing the capacity for public relations outreach; for data collection, analysis, and use; and for effective communication and group process skills, including conflict resolution.
Price: $5
Code: H14:2
To order a hard copy of this resource you will need the title, price, and code to fill out your order form.
This resource last updated: June 06, 2002
Database Information:
Source: Horace. Vol. 14, #2. Nov. 1997.
Publication Year: 1997
Publisher: CES National
Type: Horace Feature, Horace Sidebar
School Level: All
Issue: 14.2
Focus Area: School Design, Community Connections
STRAND: Community Connections: community collaboration
Data Collection and Analysis: Cycle of Inquiry, Teacher Research
Community Collaboration: Accountability