What Counts as “Good” Writing in High School?

The question posed for this post is not an easy one to answer. But I want to propose at least a partial answer based on some data that a colleague (Laura Aull at Wake Forest) and I have been working on. The data comes from Advanced Placement Literature and Advanced Placement Language exams. Altogether, we have about 400 essays (~300 Language essays and ~100 Literature essays) totaling about 150,000 words.

We digitized the essays and used a tagger to code each word for part of speech. After tagging the corpus, we ran statistical analyses of the differences between high scoring exams (those receiving a score of 4 or 5) and low scoring exams (those receiving a 1 or 2). I’ve put some of the results in the data visualization below. If you hover the mouse over the noun and verb bubbles, you will see their frequencies and their statistical significance.


Okay. So what are we looking at? The parts-of-speech I’ve put on the graph (the preposition of, pronouns, adjectives, articles, verbs, and nouns) are all unequally distributed in high and low scoring exams. In other words, some are more frequent in high scoring exams; some more frequent in lower scoring exams.

The x-axis shows the frequency of the part-of-speech in high scoring exams, and the y-axis shows the frequency in low scoring exams. The red bubbles are features that distinguish low scoring exams. The blue bubbles are features that distinguish high scoring exams.

For example, nouns are more common in higher scoring exams than in lower scoring ones. By contrast, verbs are more common in lower scoring exams.
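
To make the idea of comparing frequencies concrete, here is a minimal Python sketch of how a rate per 1,000 words could be computed from tagged text. The two toy samples, the simplified tag labels, and the function name pos_rates are all hypothetical; they are not taken from our corpus or from the tagger we used.

```python
from collections import Counter

def pos_rates(tagged_tokens, per=1000):
    """Return each tag's frequency per `per` tokens."""
    counts = Counter(tag for _word, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: per * n / total for tag, n in counts.items()}

# Hypothetical, hand-tagged toy samples with a simplified tag set
# (illustration only; not essays from the AP data).
high_sample = [
    ("the", "DET"), ("simple", "ADJ"), ("description", "NOUN"),
    ("of", "ADP"), ("the", "DET"), ("cake", "NOUN"),
    ("reveals", "VERB"), ("her", "PRON"), ("deep", "ADJ"),
    ("anxiety", "NOUN"),
]
low_sample = [
    ("she", "PRON"), ("thinks", "VERB"), ("it", "PRON"), ("is", "VERB"),
    ("sad", "ADJ"), ("and", "CCONJ"), ("she", "PRON"), ("cries", "VERB"),
]

high_rates = pos_rates(high_sample)
low_rates = pos_rates(low_sample)

# Report which group uses each tag more often, relative to its length.
for tag in sorted(set(high_rates) | set(low_rates)):
    h = high_rates.get(tag, 0.0)
    l = low_rates.get(tag, 0.0)
    leaning = "high" if h > l else "low" if l > h else "even"
    print(f"{tag:6} high={h:7.1f} low={l:7.1f} more frequent in: {leaning}")
```

Normalizing by the number of tokens matters because the two groups of essays differ in total length, so raw counts alone would not be comparable.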

Statistical significance is measured by a chi-square test. The greater the significance, the larger the bubble. But be aware that all of the features shown on this graph are statistically significant. So even though the noun bubble looks relatively small, nouns still differentiate higher scoring exams from lower scoring ones.
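
For readers who want to see what a chi-square test on a single feature looks like, here is a rough sketch using only the Python standard library. It assumes a 2x2 table (the feature vs. all other words, in high vs. low scoring exams) with no continuity correction; the exact setup behind the graph may differ, and the counts below are invented for illustration.

```python
import math

def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square (no continuity correction) for one feature
    counted in two corpora. Returns (statistic, p_value), 1 degree of freedom.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, grand - (count_a + count_b)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square tail probability
    # can be written with the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: a feature occurring 15,000 times in 75,000 words
# of high scoring exams and 13,500 times in 75,000 words of low scoring ones.
stat, p = chi_square_2x2(15000, 75000, 13500, 75000)
print(f"chi-square = {stat:.2f}, p = {p:.3g}")
```

A larger statistic means the observed counts sit further from what we would expect if the feature were spread evenly across both groups, which is what the bubble sizes encode.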

Now that we have a handle on what the graph is showing, what does it mean? What does it tell us about what distinguishes writing that gets rewarded on AP exams from writing that doesn’t?

One really interesting pattern is how these features follow the patterns of involved vs. informational production that Biber has described. Basically, involved language is less formal and more conversation-like. It is about personal interaction and is often tied to a shared context. In that kind of discourse, we tend to be less precise; clauses tend to be shorter; and we use lots of pronouns: “That’s cool, am I right?”

Informational production is very different. It tends to be more carefully planned and, rather than promoting personal interaction, its purpose is to explain and analyze. This is what we often think of when we imagine academic prose: “This juxtaposition of radically different approaches in one book anticipated the coexistence of Naturalism and Realism in American literature of the latter half of the 19th century and the beginning of the 20th century.”

Here is Biber’s list of features for the two ends of this dimension. The features highlighted in red are more common in lower scoring exams, and those highlighted in blue are more common in higher scoring exams:


The patterns are very clear. The lower scoring exams look much more like involved discourse and the high scoring ones more like informational discourse.

This is not entirely a surprise. Mary Schleppegrell, for example, has repeatedly argued that information management is crucial to students’ success in academic writing. What she means is that students need to control how they put their ideas together grammatically.

Let’s take a look at a sample sentence from a high scoring exam:


In the example, the head nouns are all in red. Information that is added before the noun (like the adjective simple) is in green. Information that is added after the noun (like the prepositional phrase of the cake) is in blue. The pronoun they and the predicative adjective simple are in orange.

Information is communicated through nouns, and is elaborated using adjectives, prepositional phrases, and other structures. This is why we see these features to a much greater degree in those higher scoring essays. In the example, the student is able to communicate a lot of information! That information connects specific textual features from the short story (capitalization and simple description) to the student’s reading of a character’s emotional state.

We can also begin to see why there might be more verbs in the lower scoring essays. The subjects and objects of their clauses are less elaborated. In other words, their clauses are shorter. Because nearly every clause contains a verb, shorter clauses mean more verbs per hundred words. So the frequency of verbs is much higher.

Another interesting result from our research is the relationship between essay length and score. This is an issue that has received some coverage, particularly regarding the (soon to be retired) essay portion of the SAT. We found that there is a positive correlation between essay length and score (0.255), but it is not a particularly strong one. My guess is that this correlation arises not only because the more successful essays have more to say, but also because they say it in a more elaborated way.
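
To give a sense of how weak a correlation of 0.255 is, here is a small Python sketch. It assumes the figure is a Pearson coefficient, which is a guess for the sake of illustration, and the word counts and scores in the example are hypothetical rather than drawn from our essays.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical essay lengths (in words) and scores, for illustration only.
lengths = [250, 310, 420, 380, 520, 290]
scores = [2, 1, 4, 3, 5, 4]
r = pearson_r(lengths, scores)
print(f"toy correlation: {r:.3f}")

# Squaring a Pearson coefficient gives the share of variance in one
# variable that is linearly associated with the other.
reported_r = 0.255
shared_variance = reported_r ** 2
print(f"share of score variance associated with length: {shared_variance:.1%}")
```

Under that assumption, length would account for only about 6.5% of the variation in scores, which fits the point that longer essays are not rewarded simply for being long.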

If you are interested, you can see all of our data here.