Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data

Sertdemir, Yaşar; Burgut, Hüseyin Refik; Alparslan, Zeliha Nazan; Ünal, İlker; Günaştı, Suhan

dc.authorid	0000-0002-8335-1927	en_US
dc.contributor.author	Sertdemir, Yaşar
dc.contributor.author	Burgut, Hüseyin Refik
dc.contributor.author	Alparslan, Zeliha Nazan
dc.contributor.author	Ünal, İlker
dc.contributor.author	Günaştı, Suhan
dc.date.accessioned	2024-07-12T21:04:38Z
dc.date.available	2024-07-12T21:04:38Z
dc.date.issued	2013	en_US
dc.department	Faculties, Faculty of Medicine	en_US
dc.description.abstract	Agreement among raters is an important issue in medicine, as well as in education and psychology. Agreement between two raters on a nominal or ordinal rating scale has been investigated in many articles. The multi-rater case with normally distributed ratings has also been explored at length. However, there is a lack of research on multiple raters using an ordinal rating scale. In this simulation study, several methods for analyzing rater agreement were compared. The focus was the special case of multiple raters using a bounded ordinal rating scale. The proposed agreement methods were compared under different settings. Three main ordinal data simulation settings were used (normal, skewed and shifted data). In addition, the proposed methods were applied to a real data set from dermatology. The simulation results showed that Kendall’s W and mean gamma highly overestimated the agreement in data sets with shifts in the data. ICC4 for bounded data should be avoided in agreement studies using rating scales with fewer than five categories, where this method highly overestimated the simulated agreement. The difference in bias for all methods under study, except mean gamma and Kendall’s W, decreased as the number of rating categories increased. The bias of ICC3 was consistent and small for nearly all simulation settings except the low agreement setting in the shifted data set. Researchers should be careful in selecting agreement methods, especially if ratings are shifted between raters, and may wish to apply more than one method before drawing any conclusions.	en_US
dc.identifier.citation	Sertdemir, Y., Burgut, H. R., Alparslan, Z. N., Ünal, İ., and Günaştı, S. (2013). Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data. Journal of Applied Statistics, 41(5), pp. 1506-1519.	en_US
dc.identifier.endpage	1519	en_US
dc.identifier.issn	1360-0532
dc.identifier.issue	5	en_US
dc.identifier.scopusquality	Q2	en_US
dc.identifier.startpage	1506	en_US
dc.identifier.uri	https://www.tandfonline.com/doi/full/10.1080/02664763.2013.788617
dc.identifier.uri	https://hdl.handle.net/20.500.12415/3796
dc.identifier.volume	41	en_US
dc.institutionauthor	Burgut, Hüseyin Refik
dc.language.iso	en	en_US
dc.publisher	Taylor and Francis Online	en_US
dc.relation.ispartof	Journal of Applied Statistics	en_US
dc.relation.publicationcategory	Article in International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.snmz	KY01794
dc.subject	Agreement	en_US
dc.subject	Multi-rater	en_US
dc.subject	Bounded ordinal scale	en_US
dc.subject	Normal distribution	en_US
dc.subject	Skewed distribution	en_US
dc.title	Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data	en_US
dc.type	Article
dspace.entity.type	Publication

Collection

Faculty of Medicine Collection
