Tuesday, April 21, 2015

Readability Indices

It’s widely known that easy and enjoyable reading helps learning and comprehension. Most people prefer reading “plain English,” and tend to tune out when a passage is too difficult to read. So when we speak of a text’s readability score, or of a readability index, what we mean is: How easy is it for readers to understand?

Measuring readability is important for a number of reasons. For one, teachers need to know whether their students are writing at grade level or need more instruction in a particular area. Readability scores also help teachers and school systems gauge whether a textbook is right for their students.

Writers also need readability scores - especially those who write for children and pre-teens. They’ve definitely helped me make sure I’m writing for the correct audience; plus, they’ve taught me how to increase or decrease my usage of complex sentences depending on my readership. Never write outside your audience!

PaperRater Premium-Only Module: Readability Scores

PaperRater is proud to offer a NEW series of twelve readability scores as part of our premium service. Just sign in, visit our proofreader, enter your text, and receive a full, side-by-side comparison of the most common readability indices.

Why do you offer twelve indices? Shouldn’t I use just one?


Each readability index uses different criteria to create a score. As you’ll see, some use input based on syllables, and others based on word length. Plus, the equations used are slightly different.

So we give you a set of twelve to eliminate bias and to provide a wider range of input. That way you can choose for yourself which ones you want to incorporate into your work.

Here’s a quick breakdown of each index we provide:

Automated Readability Index (ARI)


The ARI grades text based on a combination of word and sentence structure. Computers find it difficult to analyze syllables, so the ARI uses a formula based on the number of characters per word and the number of words per sentence, although it’s debatable whether counting characters or syllables is more helpful.
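If you’re curious how such a score is computed, here’s a rough Python sketch of the commonly published ARI formula. The function name and the deliberately simple tokenizer are just for illustration; this is not PaperRater’s implementation.

```python
import re

def automated_readability_index(text):
    """Commonly published ARI formula:
    4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    characters = sum(len(w) for w in words)
    return 4.71 * (characters / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43

print(round(automated_readability_index("The cat sat on the mat. It was very happy."), 1))
```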





Coleman-Liau Index


Creators Meri Coleman and T. L. Liau constructed this readability score for the Office of Education to standardize textbooks in the United States. Like the ARI, it operates on the assumption that characters per word is a better indicator of readability than syllables. From Wikipedia: “L is the average number of letters per 100 words and S is the average number of sentences per 100 words.”
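Here’s a similar illustrative sketch of the published Coleman-Liau formula (0.0588L - 0.296S - 15.8), again with a naive tokenizer rather than our production code:

```python
import re

def coleman_liau_index(text):
    """Published Coleman-Liau formula: 0.0588*L - 0.296*S - 15.8,
    where L = letters per 100 words and S = sentences per 100 words."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        return 0.0
    letters = sum(sum(ch.isalpha() for ch in w) for w in words)
    L = letters / len(words) * 100
    S = len(sentences) / len(words) * 100
    return 0.0588 * L - 0.296 * S - 15.8
```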




Dale-Chall Readability Formula


The Dale-Chall Readability Formula uses a different kind of input. Instead of using the number of characters in a word, it calculates the approximate grade level of a text by measuring “hard words.”

What exactly are hard words? The Dale-Chall list contains about 3,000 words known by at least eighty percent of children in the fifth grade; words not on the list are considered difficult. The higher the score, the higher the text’s grade level.
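Below is a minimal sketch of the widely cited New Dale-Chall formula. The familiar-word set in the code is a tiny stand-in for the real list of roughly 3,000 words, so treat it purely as an illustration:

```python
import re

# Tiny stand-in for the real Dale-Chall list of roughly 3,000 familiar words.
FAMILIAR_WORDS = {"the", "a", "and", "is", "was", "it", "cat", "sat", "on", "mat", "happy"}

def dale_chall_score(text, familiar=FAMILIAR_WORDS):
    """Widely cited New Dale-Chall formula:
    0.1579 * (% difficult words) + 0.0496 * (words per sentence),
    plus 3.6365 when difficult words exceed 5% of the text."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    pct_difficult = 100 * sum(w not in familiar for w in words) / len(words)
    score = 0.1579 * pct_difficult + 0.0496 * (len(words) / len(sentences))
    return score + 3.6365 if pct_difficult > 5 else score
```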




Flesch Reading Ease
Unlike the character-based indices above, the Flesch Reading Ease calculates readability from the average sentence length and the average number of syllables per word. Text is rated on a scale from 0 to 100; the lower the score, the harder the text is to read. Plain English is set at about 65, with the average word containing two syllables and the average sentence containing 15 to 20 words.

By the way, the above passage received a score of 73.7 on the Flesch Reading Ease, which means it’s slightly easier to read than plain English.
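The published Flesch Reading Ease formula is easy to sketch as well. The syllable counter below is a rough vowel-group heuristic, so its numbers will only approximate what a careful tool reports:

```python
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text):
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))
```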




Flesch-Kincaid Grade Level

The Flesch-Kincaid Grade Level is a companion index to the Flesch Reading Ease, and uses the same inputs (average sentence length + average number of syllables per word). However, the measurements are weighted differently, so the score it produces is an approximate grade level (e.g. 1-12, or higher).
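A sketch of the published grade-level formula, reusing the same rough syllable heuristic (again, an approximation rather than PaperRater’s code):

```python
import re

def count_syllables(word):
    count = len(re.findall(r"[aeiouy]+", word.lower()))
    if word.lower().endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59
```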





Fry Readability Formula


Instead of taking a complete word count, the Fry Readability Formula randomly chooses three 100-word samples from the text. It then counts the average number of syllables and the average number of sentences per hundred words, and plots them onto a graph. The intersection of the two averages represents the appropriate reading level. The formula is widely used in the healthcare industry.
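Because the Fry method ends with a graph lookup, a code sketch can only compute the two plotted averages; the sampling and syllable counting below are simplified, and the final grade must still be read off the Fry graph:

```python
import random
import re

def count_syllables(word):
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def fry_coordinates(text, n_samples=3, sample_size=100):
    """Average sentences and syllables per 100 words over random 100-word samples.
    The grade level is then read off the Fry graph (not reproduced here)."""
    tokens = re.findall(r"[A-Za-z']+|[.!?]", text)
    word_starts = [i for i, t in enumerate(tokens) if t not in ".!?"]
    if len(word_starts) < sample_size:
        raise ValueError("Text is too short for a 100-word sample.")
    sentence_counts, syllable_counts = [], []
    for _ in range(n_samples):
        start = random.choice(word_starts[: len(word_starts) - sample_size + 1])
        words_seen = syllables = sentences = 0
        for t in tokens[start:]:
            if t in ".!?":
                sentences += 1
            else:
                words_seen += 1
                syllables += count_syllables(t)
            if words_seen == sample_size:
                break
        sentence_counts.append(sentences)
        syllable_counts.append(syllables)
    return sum(sentence_counts) / n_samples, sum(syllable_counts) / n_samples
```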





Gunning Fog Index

The Gunning Fog Index is calculated from the average length of a sentence and the percentage of complex words. Its inventor, Robert Gunning, complained that the writing of his day was too complex (had too much fog) and needed to be simplified. Complex words in this case are defined as having three or more syllables. Yet while complex words can be a good indicator of readability, the index fails to account for the fact that words with three or more syllables are not necessarily difficult to comprehend.
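An illustrative sketch of the standard Fog formula, flagging “complex” words with the same rough three-syllable heuristic:

```python
import re

def count_syllables(word):
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def gunning_fog(text):
    """0.4 * (average sentence length + percentage of words with 3+ syllables)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    complex_words = sum(count_syllables(w) >= 3 for w in words)
    return 0.4 * (len(words) / len(sentences) + 100 * complex_words / len(words))
```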




LIX


LIX, or Läsbarhetsindex (the Swedish readability formula), was specially designed to gauge the readability of texts across different languages. Its formula uses (a) the number of words; (b) the number of sentences (periods); and (c) the number of long words (more than six letters). Multicultural teachers prefer its emphasis on long words and average sentence length to predict readability.


LIX = A / B + (C × 100) / A, where A is the number of words, B is the number of sentences (periods), and C is the number of long words (more than six letters).
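The formula translates almost directly into code. In this sketch, sentences are approximated by splitting on terminal punctuation, which simplifies the original definition:

```python
import re

def lix(text):
    """LIX = A / B + (C * 100) / A, where A = words, B = sentences,
    and C = long words (more than six letters)."""
    words = re.findall(r"[A-Za-zÅÄÖåäö']+", text)
    sentences = [s for s in re.split(r"[.!?:]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    long_words = sum(len(w) > 6 for w in words)
    return len(words) / len(sentences) + (long_words * 100) / len(words)
```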




Linsear Write


Similar to the Flesch readability indices, Linsear Write calculates a text’s readability from sentence length and the number of “hard words” - words with three or more syllables. It was adopted by the U.S. Air Force to grade the readability of its flight manuals.
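Below is a sketch of one commonly described version of the Linsear Write procedure. For simplicity the sentence count is taken over the whole text rather than strictly within the 100-word sample, so the result is only approximate:

```python
import re

def count_syllables(word):
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def linsear_write(text, sample_size=100):
    """Easy words (1-2 syllables) score 1 point, hard words (3+) score 3;
    divide the total by the sentence count, then halve it (subtracting 2
    first when the provisional result is 20 or less)."""
    words = re.findall(r"[A-Za-z']+", text)[:sample_size]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    points = sum(3 if count_syllables(w) >= 3 else 1 for w in words)
    provisional = points / len(sentences)
    return provisional / 2 if provisional > 20 else (provisional - 2) / 2
```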


Raygor Estimate Graph


A simple readability estimate, the Raygor Estimate allows you to calculate an approximate grade level by taking 100 words from your text and counting the number of sentences. Then you count the number of words within your sample that contain six or more letters. Plot both counts on the Raygor graph to get an approximate U.S. grade level.
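As with the Fry method, only the two plotted counts can be computed in code; the grade level itself comes from the Raygor graph. A minimal illustration:

```python
import re

def raygor_counts(text, sample_size=100):
    """Counts plotted on the Raygor graph for a 100-word sample:
    (number of sentences, number of words with six or more letters)."""
    tokens = re.findall(r"[A-Za-z']+|[.!?]", text)
    words_seen = long_words = sentences = 0
    for t in tokens:
        if t in ".!?":
            sentences += 1
        else:
            words_seen += 1
            if len(t) >= 6:
                long_words += 1
        if words_seen == sample_size:
            break
    return sentences, long_words
```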





SMOG


SMOG is a playful acronym for “Simple Measure of Gobbledygook,” and was developed as a replacement for the Gunning Fog Index. It is estimated by taking three 10-sentence samples from a piece of text, counting the words with three or more syllables, taking the square root of that count (rounded to the nearest perfect square), and then adding three.
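The full published SMOG regression gives essentially the same answer as that pencil-and-paper shortcut. A sketch, again with a rough syllable heuristic:

```python
import math
import re

def count_syllables(word):
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def smog_grade(text):
    """Published SMOG formula: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291.
    The quick manual version takes the square root of the polysyllable count
    from 30 sampled sentences and adds three."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    polysyllables = sum(count_syllables(w) >= 3 for w in words)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```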





Spache Readability Formula


This formula was designed for texts at the third-grade level or below, comparing the number of “unfamiliar words” to the number of words per sentence. “Unfamiliar words,” in this case, are words that children in third grade or below do not understand. The Spache Readability Formula is recommended for readers in third grade or below, and the Dale-Chall Readability Formula for fourth grade and above.



Ready to get started using PaperRater’s premium services, including our twelve readability indices?


Sign up today and receive the following:

  • Longer document uploads (up to 20 pages!)
  • Enhanced plagiarism detection
  • Faster processing
  • NEW readability indices
  • No banner ads
  • File uploads (.doc, .docx, .txt, .odt and .rtf documents)
And remember to try our FREE plagiarism checker and online grammar check for a full report on your use of grammar, spelling, transitional phrases and more!


Monday, April 20, 2015

Free Alternative To TurnItIn

How many times have you needed to check a paper for plagiarism, or for grammar and spelling errors, but struggled to find a quality service? Students and teachers may have recommended the site TurnItIn, but pesky registration and user fees may leave you searching for a free alternative.


Welcome to PaperRater! Unlike TurnItIn (and other sites), we require absolutely NO:


  • User fees
  • Logins or registration
  • Downloads of any kind


Just visit our FREE plagiarism checker page, upload your file (or copy and paste your text) and click the “Get Report” button to immediately receive a full plagiarism analysis. We check your paper against more than 10 billion online documents, and will alert you to any potential problems in a detailed report.


Helpful Resources To Improve Your Writing



PaperRater doesn’t just want to help you detect plagiarism - we want you to become a better writer! So we offer a number of FREE tools to help enhance any blog, essay or article.


Teachers love us, too! Not only do we give them a head start on their grading, but we also help them teach their students how to improve their writing, with:


  • Free grammar checks. Proofread text directly in your browser, with errors underlined in green. You can also edit your text using PaperRater’s free grammar suggestions, then copy and paste the corrections into a new draft.
  • Style check. Learn more about your usage of transitional phrases, and receive help with adverbs, conjunctions and prepositional phrases.
  • Sentence length. A fantastic tool for bloggers and academic writers! Increase your readership by ensuring you have the right sentences for the right audience.
  • Academic vocabulary. PaperRater also grades work based on the quality and quantity of scholarly vocabulary words found in the text, which is great for seeing how your students measure up against others at their educational level!
  • Free auto-grading. Finally, we present an overall grade of your piece, based on your initial style, grammar, vocabulary and sentence length results. Note that this grade does not account for the meaning of words, the structure of your ideas, or how well you support your arguments, so don’t take it too seriously. Even so, our auto-grader is a supremely helpful tool for improving your grammar and spelling overall.


Protecting Our Users’ Privacy



In this technological age, we want all our users to feel safe and secure knowing that their work isn’t floating around online somewhere. Nothing is more personal to people than their writing, so we promise to never, ever sell anybody’s information to any third party.


Plus, unlike TurnItIn, which remains unclear about how long it retains uploaded works in its database, we delete most submitted documents from our servers within 12 hours.


PaperRater: The Free Alternative To TurnItIn



Every day, thousands of people use PaperRater to check for plagiarism, as well as spelling and grammatical errors. Unlike other plagiarism checkers, we require no logins or registration unless you choose to sign up for our premium services.


Additionally, PaperRater caters to both students and teachers. Unlike TurnItIn, we allow and encourage students to check for accidental plagiarism before turning in their papers. And we don’t charge $8 a paper, like our competitors. Yikes!

Analyze your essay today for FREE with PaperRater’s plagiarism, grammar and spelling checker! Become a better writer with instant results on style, grammar, sentence length and more. We aren’t just an alternative to TurnItIn - we’re its replacement!

Thursday, April 9, 2015

Announcing our $1K Giveaway

$1K Giveaway for Educators


Every business has the unenviable task of announcing their services to the world.  In reality this often means tracking down the people most likely to find value in your service and then flashing ads in front of them until they buy your wares. But at PaperRater we wondered if we could forego traditional marketing and come up with ways to enrich the lives of the people that use our services. What better place to start than the classroom?

Today we are announcing our $1K Giveaway, a contest that puts money in teachers' pockets in order to enrich their classrooms.  The idea is simple: tell us how you are using PaperRater in your classroom.  How do you integrate it into your assignments? How have you and your students benefited from its use? Do you have any stories of how our free plagiarism checker and grammar check have transformed your classroom?

In order to give everyone time to participate (including teachers who are new to PaperRater's services), we are announcing this contest in the Spring of 2015 and we will begin accepting submissions in the Fall of 2015.  A special submission form will be set up at that time and the contest rules will be finalized.  At the time of this writing, we are anticipating the following prizes:
  • 1st Place: $500
  • 2nd Place: $300
  • 3rd Place: $200
Please check back in the Fall for the start of the giveaway, or sign up for our EDU mailing list.


Thursday, February 5, 2015

Automated Essay Scoring Myths: Part 2

Automated Scoring Myth #2:  Jobs Will Be Lost

This is our second article in this series, so please click here if you missed Part 1.  Note that I will be frequently using the abbreviation AES to refer to Automated Scoring / AI Scoring / Automated Essay Scoring.  Let's begin...

After making the case for the accuracy of Automated Essay Scoring (AES) systems in Part 1, it may seem that the natural consequence of AES would be the loss of jobs.  In particular, teachers and graders (called "readers") may feel vulnerable, as is evidenced by the petition created by a group of readers in 2013.  Each of these roles is worth examining separately since they are quite different in purpose and activities.

Teachers
In Part 1, we discussed the role of AES in high stakes testing, but AES is also an important trend in the classroom. When I tell people about my work on AES, the response that I get is sometimes to the effect of, "So, you are creating technology to replace teachers." This couldn't be further from the truth.  If 10% of a teacher's time is spent grading essays, would this enable us to have 10% fewer teachers?  It's not illogical to jump to that conclusion, but the math doesn't add up when it comes to teaching students how to become excellent writers. The reality is that writing is a craft that takes practice and feedback, just like any other skill.  But the time required to grade papers causes writing instructors to offer fewer writing assignments with less feedback than is optimal. Enter AES -- a valuable tool that empowers teachers to give more writing assignments and allows students to receive more feedback. AES does not replace the teacher; it's just another tool that the teacher can use. In fact, it may be the best tool!  Some teachers have told us that they mandate usage of PaperRater by their students before the teacher even sets eyes on each paper. PaperRater takes care of checking grammar, spelling, word choice, and more, which frees the teacher up to help each student express themselves with clarity and develop their own distinct flair.

Readers (a.k.a. Graders)
The issue of jobs with regard to readers employed by testing institutions is a bit more opaque.  It is true that when a computer scores a response, that is one less response that will be scored by a human reader.  But that is not the whole story.  AES systems must be "trained" for each prompt that they are expected to grade, and this process requires that human readers score a number of responses (perhaps 600-2000).  The computer then uses this training set to build a model that it can use to score future responses.  This means that human readers are inextricably tied to the AES technology for each and every prompt.  Because of the expense associated with human readers, writing assignments have been excluded from most standardized tests that students take each year.  But, thanks to AES, this may be changing.  Large groups of school systems in the U.S. and abroad are evaluating AES technology and vendors with the intention of incorporating written assessments (short answer and essay) into standardized testing in a wide variety of subjects from Biology to English Composition.  If successful, this will represent incredible demand for the scoring of written responses by both humans and computers.  Essentially, AES would be "growing the pie", rather than just taking pieces of the pie away from human readers.  So, it's my belief that AES will result in more jobs for human readers, not fewer.  However, I do concede that the future is much less clear in this area.
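To make the "training" step concrete, here is a toy sketch of the general idea: extract a few surface features from each human-scored response, then fit a simple regression model that can score new responses to the same prompt. It is purely illustrative; the feature set, sample data, and model choice are assumptions for the example, not PaperRater's engine or any vendor's production system.

```python
import re
from sklearn.linear_model import Ridge

def surface_features(response):
    """A few crude surface features; real AES systems use far richer ones."""
    words = re.findall(r"[A-Za-z']+", response)
    sentences = [s for s in re.split(r"[.!?]+", response) if s.strip()]
    n_words, n_sents = max(len(words), 1), max(len(sentences), 1)
    return [
        len(words),                                    # overall length
        len(set(w.lower() for w in words)) / n_words,  # vocabulary variety
        sum(len(w) for w in words) / n_words,          # average word length
        len(words) / n_sents,                          # average sentence length
    ]

# Hypothetical training set: responses to ONE prompt, each scored by human readers.
training_responses = ["First sample essay text ...", "Second, longer sample essay text ..."]
human_scores = [3, 5]

model = Ridge().fit([surface_features(r) for r in training_responses], human_scores)

# The trained model can now score new responses to the same prompt.
print(model.predict([surface_features("A brand new essay to be scored ...")]))
```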

About PaperRater

As in Part 1, I am including a shameless plug for our free Automated Essay Scoring tool. Students and teachers appreciate the immediate feedback that they receive from PaperRater. You will not find another free tool that offers so many benefits including grammar check, spelling check, analysis of word choice, automated scoring, and plagiarism detection.  We hope you will give it a try!  



Tuesday, December 16, 2014

Automated Essay Scoring Myths: Part 1

In educational institutions across the globe, there is an ongoing debate over the use of Automated Scoring systems.  I use the term "debate" rather loosely, as it seems more like a clattering of voices at times, often from people completely unfamiliar with Automated Scoring.  The most contentious question is whether these systems should be used in the scoring of high stakes tests.  At PaperRater we've sat on the sidelines and watched this discussion unfold, but feel that now might be a good time to add our 2 cents.  Today we are launching a blog series entitled "Automated Essay Scoring Myths".  This series will examine some of the myths surrounding this technology and explain how it works in the process.  We welcome your feedback.

Myth #1:  Computers Can NOT Grade as Well as Humans

This is one that I hear a lot both in print and when talking to people about Automated Scoring.  Just look at what some people are saying:

  • Les Perelman, former writing professor at MIT: "My main concern is that it doesn't work."
  • Mark Shermis, University of Akron researcher: "It can't tell you if you've made a good argument, or if you've made a good conclusion."
  • Diane Ravitch, research professor at NYU: "Computers can’t tell the difference between reasonable prose and bloated nonsense."
  • Petition of Human Readers: "current machine scoring of essays is not defensible, even when procedures pair human and computer raters."
Meanwhile, others are saying things that might suggest the opposite:
  • Mark Shermis, University of Akron researcher: "A few of them [AES systems] actually did better than human raters."
  • Sandra Foster, lead coordinator W. Virginia: "We are confident the scoring is very accurate."
  • Judy Park, Utah Associate Superintendent: "What we have found is the machines score probably more consistently than human scorers."
Which is the correct answer?  Mark Shermis, the researcher quoted in both sections above, offered a study that analyzed results from Automated Essay Scoring competitions sponsored by the Hewlett Foundation in which several AES systems competed against each other.  A public competition later followed, and the results were stunning.  Both private and public systems were able to score at or above the level of humans!  The Mark Shermis study can be found here.

Of greater weight than Shermis' study are the real-world results that are being seen.  During my time working for an AES vendor, I participated in the development of AES technology that was used as a "2nd reader" for a particular US state.  What this means is that a human reader scored each response and the computer offered a 2nd score.  If the 2 scores were substantially different, then a 3rd human would set the final score.  Our system graded thousands of responses across more than a dozen prompts and grade levels, on topics of varying subject matter and length.  Amazingly, the computer was more accurate than the human readers on EVERY prompt for EVERY grade level.  However, not every project was this successful.  One trial project for a particular country yielded results where the computer was slightly less accurate than the human readers on some traits, although within reasonable measures of error.  Regardless, the message could not be more clear:  Even in its infancy, Automated Scoring technology is comparable to humans, and it is only going to get better.


Why All the Fuss?

From what I've read and the conversations that I've had, the issues and fears about the quality of computer grading stem from two points:

Humans and Computers Grade Differently

Anyone who has proofread a paper has an intuitive feel for how a human grades.  We start from the beginning and read through a paper looking for errors in mechanics.  We breathe in the words and take note of how they make us feel based on the expressions presented.  We grasp the subtleties (usually) and also take note of how well arguments are made and supported...

Computers share many of the same approaches to grading, but handle some things differently.  While they do scan the paper for errors in grammar and proper usage of appropriate vocabulary, they may estimate other things, like the quality of logical arguments, by using statistical analyses of the presence of certain word sets, or the similarity of a given response to responses that the computer has already seen graded.  This can make a skeptical audience a little uneasy, but the results show that it works.  If a simpler physics equation can accurately model a complex physical process, would you demand that an exact simulation be used instead?  No.  Similarly, research and real-world usage are showing that AI Scoring systems are every bit as accurate as humans, even if their approach to grading is different.  Furthermore, emulating a human grader has its drawbacks, as discussed below.
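As a toy illustration of the "similarity to already-graded responses" idea, the sketch below scores a new short answer by weighting the scores of previously graded answers by textual similarity. The data and method are invented for the example and do not describe any particular vendor's system:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented examples of already-graded short answers (answer -> human score).
graded = {
    "Photosynthesis converts light energy into chemical energy.": 4,
    "Plants eat sunlight and somehow make food.": 2,
}

vectorizer = TfidfVectorizer()
graded_matrix = vectorizer.fit_transform(list(graded.keys()))

def similarity_score(new_answer):
    """Weight the known scores by the new answer's similarity to each graded answer."""
    sims = cosine_similarity(vectorizer.transform([new_answer]), graded_matrix)[0]
    total = sims.sum()
    if total == 0:
        return 0.0
    return float(sum(w * s for w, s in zip(sims / total, graded.values())))

print(similarity_score("Photosynthesis turns light into chemical energy."))
```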


Humans Do Not Grade as Well as You Think

This one may sting a little bit, but it has to be said.  Let me explain.  You may be picturing a teacher thoughtfully reading over an essay with a red pen -- pausing for a moment to scribble some wise advice in the margins and then continuing on.  This picture is all wrong when it comes to testing.  Graders, called "readers", are given only a few minutes to read an essay and assign a score, usually on a small scale (e.g., 1-6), and they must adhere to a rubric.  Considering these restrictions, they do remarkably well, but we mustn't forget Alexander Pope's poignant observation: "To err is human".  For all the criticism and fears that I've read regarding Automated Scoring systems, I'm amazed at how we hold ourselves in such high esteem.  I see the same response to the autonomous vehicles being developed.  I get the feeling that people do not realize that machines do not need to be perfect, just better than the comparable human.  Here are just some of the errors that human readers are prone to:

  • Bias.  We are easily influenced by things in the response that should not matter.  For example, something about the writer might remind us of a loved one and that affects the way we score their writing.  That is just one example, but there are many more.
  • Dynamic State = Mixed Output.  We are a complex, chaotic system and this is a nightmare when it comes to scoring.  Computers win when it comes to being rational and consistent.  How is a score affected by a reader who is hungry?  Sad?  Sleepy?  Hungover?  Where is the public outcry over these human "machines" that offer different grades based on their ever-changing internal state?   
  • Drift.  A number of psychologists and behavioral economists have studied the way that humans lack an objective measurement system.  Everything is comparative.  The perceived shade of a color is different when next to a darker vs lighter color.  The length of a line seems shorter if it is next to a longer line.  How well written does an average essay seem after having just read five poorly written essays in a row?
  • Egregious Inconsistency.  The previous two items deal with inconsistency, but this deserves its own bullet point.  A computer will always give the same output for the same input.  This seems like an obvious and basic prerequisite for any grader; yet, for short answer responses, it is common for human graders to give different scores to the exact same answer.  Let me repeat that: "It is common for human graders to give different scores to the exact same answer."  In fact, I once saw the same answer receive the absolute lowest score from one human reader and the absolute highest score from the other reader.  This seems to me to be the epitome of a poorly designed grading system, and yet it is something that is quite common for human readers.
  • Lack of Precision.  Humans are great at generalizing and connecting relations, but very poor at making calculations in terms of speed and accuracy.  Forcing a human to quickly grade an essay and adhere to a lengthy rubric is simply a mismatch of a human's innate capabilities.  Computers, on the other hand, are quite adept at scanning and processing information, tallying items, counting matches, and making calculations with both speed and accuracy.  This is a key advantage of a computer when a detailed rubric is used and time is limited.  
The point of this section is not to bash us humans, but to offer a candid view of our flaws and to help us recognize that combining the different approaches of humans and computers offers us the best path forward.  And this is precisely the approach that Automated Scoring systems are taking.

What About PaperRater?

I would like to end this article with some information on our own FREE Automated Essay Scoring engine affectionately named Grendel.  Grendel is a general scoring system that is not calibrated to specific prompts, such as the systems used in high stakes testing.  It is also designed for speed and limited usage of resources, so the accuracy is below that of a human grader.  Nevertheless, we do plan on offering a more accurate system in the future for premium users.  In the meantime, Grendel offers a general score along with automated feedback on grammar, spelling, word choice, and much more.  We have received hundreds of emails from educators that are using PaperRater to allow their students to receive on-demand feedback before turning their papers in.  My favorite message came from an English teacher who said that PaperRater is the most useful tool that she has used in 25 years of teaching.  We hope you will give it a try!  






Wednesday, July 9, 2014

Even Easier to Use!


CAPTCHA Woes

One of the guiding principles at PaperRater is to make things simple and painless to use.  No signups, no logins, no payments, no three minute wait for results...  We fancied ourselves rather satisfactory in this regard.  That is, until you told us otherwise.  We were shocked to discover that you do not like squinting at images and typing in crooked letters.  And we were at least a small bit saddened when we heard that you do not share our joy in deciphering blurry house numbers.  So, it is with mixed emotions (and sarcasm) that we officially announce the end of reCAPTCHA for most users of our automated proofreading and plagiarism detection tools.

What exactly does this mean? 
A week ago, all users of our site were confronted with the dreaded reCAPTCHA before submitting text into our automated proofreader or our plagiarism checker.  As of today, it has been removed from these tools.  However...this does not mean that CAPTCHA is completely gone:

  • Other parts of the site may still use reCAPTCHA (e.g., contact form)
  • reCAPTCHA may still be displayed if you are suspected of spamming (either by the content you submit, or by the number of submissions coming from your IP address)
  • We may use a less annoying CAPTCHA in the future (one that is not reCAPTCHA), if needed

Why was reCAPTCHA used in the first place?
CAPTCHAs are used to identify a visitor as a human, rather than a bot.  Bots represent a definite problem for our site because 1) we are free, and 2) our services require a lot of computing power.  In other words, bots cost us more money than they do most sites.  Nevertheless, we have plans to defeat the bots without forcing most human visitors to enter a CAPTCHA.

Other news in usability
Perhaps not as celebrated as the end of reCAPTCHA...we have also decided to remove the title field.  For most users, we believe this field is unnecessary and just one more obstacle to a quick and painless submission process.

Thanks for reading this far!




Monday, November 4, 2013

Plagiarism Detection Changes

If you've been a regular user of PaperRater, then you may already be aware that we've been struggling with issues in the plagiarism detection module.  We first ran into problems when using a 2nd-tier search API that powered this feature, but we were able to switch to Google and restore service.  This was great for a time, but led to even worse problems as Google accidentally killed our subscription at one point, and, more recently, they have set a very low limit on API requests, which has caused our plagiarism check to have issues later in the day, while working flawlessly in the mornings.
After temporarily disabling the plagiarism check for the past few days, we are rolling it out again today with the Bing Search API powering it under the hood.  We hope this will yield better uptime, but we have already found bugs with their phrase queries, about which we've contacted them.  Feel free to contact us with any feedback regarding this rollout.  Thank you for the patience you've shown as we continue to work through these issues.  And please continue to spread the word about this free resource.  Our team is working hard to deliver a top-notch product that is accessible to all.  But all funds are currently devoted to development and operations, so we need your help to spread the word!  Linking to our website wouldn't hurt either.  :-)   Thanks!

UPDATE  Nov. 10, 2013:  We received a lot of feedback in the few days after this was posted (thank you!).  We responded to this feedback by making further enhancements to the plagiarism detection, which we released near the end of this week.  Results are not optimal, but the dissatisfaction rate has dropped significantly.  We will continue to roll out other enhancements to the plagiarism checker in the weeks ahead that should help address accuracy.