Long-range imaging requires effective compensation for the wavefront distortions caused by atmospheric turbulence. These distortions can be characterized by their effect on the point spread function (PSF). Consequently, synthesizing PSFs with the appropriate turbulence properties, for a given set of optics, is critical for modeling and mitigating turbulence. Recent work on sparse and redundant dictionary methods demonstrated a three-order-of-magnitude reduction in the computing time needed to create synthetic PSFs, compared to traditional methods based on wave propagation. The central challenge in harnessing the computational benefit of a dictionary-based approach is the careful choice of the dictionary, or set of dictionaries: the choice must adequately capture the range of turbulence conditions and optical parameters present in the desired application, or the computational benefits will not be realized. Thus, it is critical to understand the extent to which a dictionary trained on data with one set of parameters can be used to synthesize PSFs that represent a different set of experimental conditions. In this work, we examine statistical tests that provide metrics for quantifying the similarity between two sets of PSFs, and we then use these results to measure dictionary performance. We show that our measure of dictionary performance is a function of the turbulence conditions and the experimental optics underlying the training data used to create a dictionary. Knowledge of the functional form of this performance metric allows us to choose the ideal dictionary, or set of dictionaries, to efficiently model a given range of turbulence and optical conditions. We find that choosing dictionary training data with slightly less turbulence than the target turbulence condition improves the similarity between synthetic and experimentally measured PSFs.
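As a minimal illustration of the kind of statistical comparison described above, the sketch below quantifies the similarity of two sets of PSFs by reducing each PSF to a scalar summary (its second-moment RMS radius) and applying a two-sample Kolmogorov-Smirnov test to the resulting distributions. All names, the choice of summary statistic, and the toy Gaussian PSFs are assumptions for illustration only; they are not the specific tests or data used in this work.

```python
# Hedged sketch: compare two sets of PSFs via a two-sample KS test
# on a scalar summary statistic. The Gaussian "PSFs" are synthetic
# stand-ins for turbulence-degraded PSFs.
import numpy as np
from scipy.stats import ks_2samp

def psf_width(psf):
    """Second-moment (RMS) radius of a 2-D PSF, used as a scalar summary."""
    psf = psf / psf.sum()
    ny, nx = psf.shape
    y, x = np.mgrid[0:ny, 0:nx]
    cy, cx = (psf * y).sum(), (psf * x).sum()
    return np.sqrt((psf * ((y - cy) ** 2 + (x - cx) ** 2)).sum())

def gaussian_psf(sigma, size=33):
    """Toy Gaussian PSF; sigma loosely plays the role of seeing strength."""
    y, x = np.mgrid[0:size, 0:size] - size // 2
    psf = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

# Two ensembles with slightly different mean "turbulence strength".
rng = np.random.default_rng(0)
set_a = [gaussian_psf(3.0 + 0.2 * rng.standard_normal()) for _ in range(50)]
set_b = [gaussian_psf(3.1 + 0.2 * rng.standard_normal()) for _ in range(50)]

# KS statistic near 0 indicates similar width distributions; near 1, dissimilar.
stat, p = ks_2samp([psf_width(q) for q in set_a],
                   [psf_width(q) for q in set_b])
print(f"KS statistic = {stat:.3f}, p-value = {p:.3f}")
```

In practice, a full comparison would use richer per-PSF features than a single width, but the same two-sample machinery applies to any scalar summary.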