Figure 1 is a scatter diagram showing the correlation between hemoglobin measurements obtained by the two methods listed in Table 3. The dotted line is a trend line (the least-squares line) fitted to the observed values, and the correlation coefficient is 0.98. However, the statistical methods for assessing agreement vary depending on the nature of the variables examined and the number of observers between whom agreement is sought; these are summarized in Table 2 and explained below. The diagram is adorned with the usual unnecessary regression equation, correlation coefficient and P value. The last of these tests the least credible null hypothesis: that the two measurements are unrelated. I suggested to Max Bulsara that a line of equality would be more informative for this plot of hemoglobin measured in ten patients by two different methods.

For ordinal data, where there are more than two categories, it is useful to know whether the ratings given by different assessors differed by a small or a large amount. For example, microbiologists may grade bacterial growth on culture plates as none, occasional, moderate or confluent. Here, two raters grading the same plate as "occasional" and "moderate" would represent a smaller degree of disagreement than grades of "none" and "confluent." The weighted kappa statistic takes this difference into account. It therefore gives a higher value when the raters' responses correspond more closely, with the maximum score reserved for perfect agreement; conversely, a larger difference between two ratings yields a lower value of weighted kappa. The scheme used to assign weights to the differences between categories (linear or quadratic) can vary.
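The weighting idea can be sketched in a few lines of Python. This is a minimal illustration rather than a statistics library: the rating data below are invented, and the `weighted_kappa` helper is an assumed name. It uses disagreement weights of |i−j|/(k−1) for the linear scheme and the square of that for the quadratic scheme, so nearby categories ("occasional" vs "moderate") are penalised less than distant ones ("none" vs "confluent").

```python
# Weighted Cohen's kappa for two raters on an ordinal scale.
# Categories are ordered: 0 = none, 1 = occasional, 2 = moderate,
# 3 = confluent (the microbiology example; data are hypothetical).

def weighted_kappa(rater1, rater2, n_categories, scheme="linear"):
    """Weighted kappa with linear or quadratic disagreement weights."""
    k = n_categories

    def weight(i, j):
        # Normalised disagreement between categories i and j.
        d = abs(i - j) / (k - 1)
        return d if scheme == "linear" else d ** 2

    n = len(rater1)
    # Observed mean weighted disagreement.
    observed = sum(weight(a, b) for a, b in zip(rater1, rater2)) / n
    # Chance-expected weighted disagreement, from the marginal frequencies.
    p1 = [rater1.count(c) / n for c in range(k)]
    p2 = [rater2.count(c) / n for c in range(k)]
    expected = sum(weight(i, j) * p1[i] * p2[j]
                   for i in range(k) for j in range(k))
    return 1 - observed / expected

r1 = [0, 1, 1, 2, 2, 3, 3, 0, 1, 2]
r2 = [0, 1, 2, 2, 3, 3, 2, 0, 1, 2]
print(round(weighted_kappa(r1, r2, 4, "linear"), 3))     # → 0.737
print(round(weighted_kappa(r1, r2, 4, "quadratic"), 3))  # → 0.857
```

In this invented data set every disagreement is by a single category, so the quadratic scheme, which penalises small disagreements lightly, gives a higher kappa than the linear one.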

The methods used to assess consistency between observers, chosen according to the nature of the variables measured and the number of observers, can also be applied when the same rater assesses the same patients on two occasions (e.g., 2 weeks apart) or, in the example above, marks the same answer sheets again after 2 weeks. Kappa has its limitations: (i) it does not take the magnitude of disagreement into account, so it is unsuitable for ordinal data; (ii) it cannot be used when there are more than two raters; and (iii) it does not distinguish between agreement on positive and on negative findings, which can be important in clinical situations (e.g., wrongly diagnosing a disease and wrongly excluding it may have different consequences).

It is often asked whether measurements made by two (sometimes more than two) different observers, or by two different techniques, give similar results. This is termed agreement, concordance or reproducibility between measurements. Such an analysis looks at pairs of measurements, either both categorical or both numerical, with each pair made on one individual (or one pathology slide, or one X-ray). Diastolic blood pressure varies less between individuals than systolic pressure, so we would expect a poorer correlation for diastolic pressures if methods are compared in this way.
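For two raters and categorical ratings, the unweighted kappa discussed above compares the observed proportion of agreement with the agreement expected by chance. A minimal sketch, using hypothetical binary ratings (1 = disease present, 0 = absent) and an assumed helper name `cohens_kappa`:

```python
# Cohen's (unweighted) kappa for two raters classifying the same
# cases as positive (1) or negative (0). Data are hypothetical.

def cohens_kappa(rater1, rater2):
    n = len(rater1)
    categories = set(rater1) | set(rater2)
    # Observed proportion of agreement.
    p_obs = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance-expected agreement, from the marginal proportions.
    p_exp = sum((rater1.count(c) / n) * (rater2.count(c) / n)
                for c in categories)
    return (p_obs - p_exp) / (1 - p_exp)

r1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
r2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
print(round(cohens_kappa(r1, r2), 2))  # → 0.6
```

Note that the formula treats every disagreement identically and pools positive and negative agreement into a single figure, which is exactly limitations (i) and (iii) above.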