How to Calculate Inter-rater Reliability
Inter-rater reliability describes the amount of agreement between multiple raters or judges. An inter-rater reliability formula provides a consistent way to measure the level of consensus among judges, which lets you gauge how reliable both the judges and their ratings are in a given situation. Too little consensus often indicates that the criteria on which the judges base their ratings need to be changed.
Type the ratings into a text document. Precede each rating with a judge identifier, such as a name or number, and separate the identifier from the rating with a comma and a single space. Put each rating on its own line, including any additional ratings from the same judge. Example:
Doug, 6
Doug, 7
Beth, 6
Sam, 9
Ben, 8
Save the text document to your computer.
Go to the Inter-rater Reliability Calculator listed in the Resources.
Scroll to the middle of the page where it says "Specify the text file with ratings" and click "Browse".
Select the text document you saved in Step 2.
Select the number of raters from the drop-down menu next to the "Browse" button.
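Before uploading, it can help to confirm that the file follows the "judge, rating" layout described in the first step. The following is a minimal sketch of such a check, not part of the calculator itself; the function name parse_ratings is hypothetical, and the sample text simply repeats the example ratings from the first step.

```python
from collections import defaultdict


def parse_ratings(text):
    """Group ratings by judge from lines of the form 'Judge, rating'.

    Hypothetical helper for checking the file layout before upload.
    """
    ratings = defaultdict(list)
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        judge, sep, value = line.partition(",")
        if not sep:
            raise ValueError(f"line {line_number}: expected 'judge, rating'")
        # float() raises ValueError if the rating is not numeric
        ratings[judge.strip()].append(float(value))
    return dict(ratings)


# The example ratings from the first step, one per line.
sample = "Doug, 6\nDoug, 7\nBeth, 6\nSam, 9\nBen, 8\n"
ratings = parse_ratings(sample)

for judge, values in ratings.items():
    print(judge, values)
```

The number of keys in the result is the number of distinct judges, which is the value to pick for the number of raters.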
- The calculated reliability should fall between 0 and 1, where 1 indicates a high degree of reliability and 0 indicates no reliability.
- By translating qualitative measurements (e.g., bad, neutral and good) into quantitative ones (e.g., 1, 2 and 3), you can use this same method to measure the inter-rater reliability of non-numerical ratings as well.
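As an illustration of both tips, the sketch below converts qualitative labels to numbers and then computes simple pairwise percent agreement, which also falls between 0 and 1. This is an assumption chosen for illustration: the calculator may use a different formula, and the judges, labels, and function names here are hypothetical.

```python
from itertools import combinations

# Mapping from qualitative labels to numbers, as suggested in the tip above.
SCALE = {"bad": 1, "neutral": 2, "good": 3}


def encode(labels, scale=SCALE):
    """Translate qualitative labels into their numeric codes."""
    return [scale[label.lower()] for label in labels]


def percent_agreement(ratings_by_judge):
    """Share of (judge pair, item) comparisons with identical ratings.

    Returns a value between 0 (no agreement) and 1 (full agreement).
    Simple percent agreement only; it does not correct for chance.
    """
    matches = 0
    total = 0
    for a, b in combinations(ratings_by_judge.values(), 2):
        if len(a) != len(b):
            raise ValueError("every judge must rate the same items")
        for x, y in zip(a, b):
            matches += x == y
            total += 1
    if total == 0:
        raise ValueError("need at least two judges and one rated item")
    return matches / total


# Hypothetical data: two judges rating the same four items.
judges = {
    "Doug": encode(["good", "bad", "neutral", "good"]),
    "Beth": encode(["good", "bad", "good", "good"]),
}
score = percent_agreement(judges)
print(score)
```

Percent agreement is the simplest possible measure and tends to overstate reliability because it ignores agreement that would occur by chance, so treat it as a rough check rather than a substitute for the calculator's result.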