In constructing tests, researchers perform numerous item analyses for different purposes. As mentioned previously, construct validity is a major concern at the initial stages of test construction, so items are analyzed to see if: (a) they tap the characteristic(s) in question, and (b) taken together, the items comprehensively capture qualities of the characteristic being tested. After the items have been designed and written, they will often be administered to a small sample to see if they are understood as the researcher intended, to examine if they can be administered with ease, and to see if any unexpected problems crop up. Often the test will need to be revised.
The revised test is then administered to the sample of interest, and the difficulty of each item is assessed by counting the correct and incorrect responses to it. Often the proportion of test takers correctly answering an item will be plotted in relation to their overall test scores. This provides an indication of item difficulty in relation to an individual's ability, knowledge, or particular characteristics. Item analysis procedures are also used to see if any items are biased toward or against certain groups. This is done by identifying items that members of certain groups tend to answer incorrectly more often than other test takers do.
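The item statistics described above can be illustrated with a short Python sketch. It assumes responses are scored 0 (incorrect) or 1 (correct) and stored one row per test taker. The function names, the small data set, and the group labels "A" and "B" are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def item_difficulty(responses):
    """Proportion of test takers answering each item correctly.

    responses: list of rows, one per test taker, each a list of 0/1 scores.
    """
    n_takers = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_takers for j in range(n_items)]


def proportion_correct_by_total(responses, item):
    """For one item, proportion correct among test takers at each total score.

    These are the values one would plot against overall test score.
    """
    counts = defaultdict(lambda: [0, 0])  # total score -> [n correct, n takers]
    for row in responses:
        total = sum(row)
        counts[total][0] += row[item]
        counts[total][1] += 1
    return {total: c / n for total, (c, n) in sorted(counts.items())}


def difficulty_by_group(responses, groups):
    """Item difficulty computed separately for each group of test takers."""
    rows_by_group = defaultdict(list)
    for row, group in zip(responses, groups):
        rows_by_group[group].append(row)
    return {g: item_difficulty(rows) for g, rows in rows_by_group.items()}


# Hypothetical data: six test takers, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
groups = ["A", "A", "A", "B", "B", "B"]

difficulty = item_difficulty(responses)
curve_item_2 = proportion_correct_by_total(responses, item=2)
group_difficulty = difficulty_by_group(responses, groups)
```

In this made-up data set, the third item (index 2) is answered correctly only by test takers with high total scores, and no one in group "B" answers it correctly. A raw gap like this would only flag the item for closer review. Because groups can differ in overall ability, a gap in proportion correct does not by itself show that an item is biased.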
Test refinement continues until validity and reliability are adequate for the test's goals. Thus item analysis, validity, or reliability data may prompt the researcher to return to earlier stages of the test design process to further revise the test.