To build capacity in medicines management, the Uganda Ministry of Health introduced a nationwide supervision, performance assessment and recognition strategy (SPARS) in 2012. Medicines management supervisors (MMS) assess performance using 25 indicators to identify problems, focus supervision, and monitor improvement in medicines stock and storage management, ordering and reporting, and prescribing and dispensing. Although the indicators are well recognized and used internationally, little was known about their reliability. An initial assessment of inter-rater reliability (IRR), which measures agreement among raters (here, the MMS), showed poor IRR; we subsequently implemented efforts to improve it. The aim of this study was to assess IRR for SPARS indicators at two subsequent time points to determine whether IRR increased following those efforts. At the initial assessment, only five (21%) indicators had acceptable reproducibility, defined as an IRR score ≥ 75%; prescribing quality indicators had the lowest IRR and stock management indicators the highest. By the third IRR assessment, 12 (50%) indicators had acceptable reproducibility, and the overall IRR score had improved from 57% to 72%. Across all three assessment periods, the IRR of simple indicators was consistently higher than that of complex indicators. We found no correlation between IRR scores and MMS experience or professional background.
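
The text above does not state how the IRR score was calculated. As an illustration only, the sketch below assumes one common definition: the percentage of raters whose score for an indicator matches the most frequent (modal) score, compared against the 75% cut-off quoted above. The indicator names, the ratings, and the function names `percent_agreement` and `is_acceptable` are hypothetical and are not taken from the study.

```python
from collections import Counter

# Cut-off for "acceptable reproducibility" quoted in the text (percent).
ACCEPTABLE_THRESHOLD = 75.0


def percent_agreement(ratings):
    """Percentage of raters whose rating equals the most common rating.

    Assumption: this modal-agreement definition is illustrative; the
    study's actual IRR formula is not given in the text.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    _, modal_count = Counter(ratings).most_common(1)[0]
    return 100.0 * modal_count / len(ratings)


def is_acceptable(score, threshold=ACCEPTABLE_THRESHOLD):
    """True if an agreement score meets the acceptability cut-off."""
    return score >= threshold


# Hypothetical data: eight raters score the same facility on two
# made-up indicators (1 = criterion met, 0 = not met).
example_ratings = {
    "stock_card_available": [1, 1, 1, 1, 1, 1, 1, 0],
    "rational_prescribing": [1, 0, 1, 0, 0, 1, 1, 0],
}

scores = {
    name: percent_agreement(ratings)
    for name, ratings in example_ratings.items()
}

for name, score in scores.items():
    label = "acceptable" if is_acceptable(score) else "not acceptable"
    print(f"{name}: {score:.1f}% ({label})")
```

Under this assumed definition, a binary indicator can never score below 50%, so the measure is lenient; a chance-corrected statistic such as Cohen's or Fleiss' kappa would be a stricter alternative.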