Purpose: This research aimed to develop a test instrument for higher order thinking skills in chemistry that is valid and reliable and whose items are of good quality. Design/Approach/Methods: The study used a quantitative approach with modifications to the Wilson, Oriondo, and Antonio models. A sample of 275 students was drawn using stratified random sampling. Polytomous data were analyzed using the Partial Credit Model. Findings: The analysis yielded validity and reliability coefficients of 0.9 and 0.854, respectively, with 16 items in the fit category (79%). On the difficulty index, 1 of the 20 items fell in the “Difficult” category and 19 in the “Medium” category. Test takers’ ability (θ) ranged from −2 to +2, which is relatively good. The developed test instrument is therefore qualified for measuring higher order thinking skills in chemistry. Originality/Value: This study discusses the application of Item Response Theory to developing test instruments for higher order thinking skills in chemistry learning. We hope that it will stimulate further research on Item Response Theory analysis and on the role of teachers in developing test instruments to measure students’ abilities in 21st-century learning.
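
For readers unfamiliar with the Partial Credit Model named in the methods, the following is a minimal sketch of how the model assigns probabilities to the score categories of one polytomous item, given a test taker's ability θ. It shows the standard textbook form of the model, not the study's own analysis code; the function names and the threshold values in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta      -- ability of the test taker (logits)
    thresholds -- step difficulties delta_1..delta_m of the item (logits)
    Returns a list of m + 1 probabilities for scores 0..m.
    """
    # Score x has numerator exp(sum_{k<=x} (theta - delta_k));
    # the sum is empty (zero) for score 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, thresholds):
    """Expected item score at ability theta."""
    probs = pcm_probabilities(theta, thresholds)
    return sum(score * p for score, p in enumerate(probs))


# Hypothetical item with three score categories (0, 1, 2).
example_thresholds = [-0.5, 0.8]  # illustrative values, not from the study
for theta in (-2.0, 0.0, 2.0):    # the ability range reported in the abstract
    probs = pcm_probabilities(theta, example_thresholds)
    print(theta, [round(p, 3) for p in probs])
```

With a single threshold the model reduces to the dichotomous Rasch model, which is one way to sanity-check an implementation. Estimating the thresholds and abilities from response data is a separate step that would normally be left to a dedicated IRT package.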