After connecting the emission lines and the stellar continua using PCA and clustering, we just had to build a random number generator that reproduces this connection while mimicking the distributions seen in SDSS. This allowed us to create a realistic galaxy spectral catalogue in which we could switch the photometric errors and the emission lines on and off as we pleased!
The effect of emission lines on the performance of photometric redshift estimation algorithms
Content overview (TL;DR)
Connecting emission lines, which are set by the ionized interstellar medium (ISM), with the continuum, which describes the stellar populations present in the galaxy, is hard. To do this within a theoretical framework, one has to solve a problem with a large number of free parameters, taking into account not only the various stellar populations present, the distribution of the ISM, reddening, and so on, but also the geometry of the galaxy. This is extremely hard, and getting it right requires a lot of tuning.
Hence, doing this on an empirical, data-driven basis is easier; we just need to capture the variations somehow. But you may ask, how can one connect emission lines and galaxy continua? This is where one can make use of the weak correlation between the stellar populations and the process exciting the ISM clouds. Two main processes can excite them: UV radiation from young massive stars (that is, star formation) and active galactic nuclei (AGNs). Due to feedback processes, you cannot have significant contributions from both at the same time; AGNs tend to shut down star formation. This is the weak correlation one can make use of, and this is what we tried to capture using PCA and clustering.
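To make the PCA and clustering idea a bit more concrete, here is a minimal sketch in Python. It is not our actual pipeline: the spectra are synthetic stand-ins, the function names `pca` and `kmeans` are hypothetical helpers written for this illustration, and plain k-means is only one possible choice of clustering method. It simply shows how continua can be compressed into a few principal-component coefficients and then grouped in that coefficient space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for continuum spectra (assumption, not SDSS data):
# 200 "galaxies" x 50 wavelength bins, differing mainly in spectral slope.
n_gal, n_bins = 200, 50
wave = np.linspace(0.0, 1.0, n_bins)
slope = rng.normal(0.0, 1.0, size=(n_gal, 1))
spectra = 1.0 + slope * (wave - 0.5) + rng.normal(0.0, 0.05, size=(n_gal, n_bins))


def pca(data, n_components):
    """PCA via SVD of the mean-subtracted data matrix."""
    mean = data.mean(axis=0)
    centred = data - mean
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]          # orthonormal basis spectra
    coeffs = centred @ components.T         # per-galaxy coefficients
    explained = (s**2 / np.sum(s**2))[:n_components]
    return mean, components, coeffs, explained


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on the PCA coefficients."""
    local_rng = np.random.default_rng(seed)
    centres = points[local_rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = points[labels == j].mean(axis=0)
    return labels, centres


mean, components, coeffs, explained = pca(spectra, n_components=3)
labels, centres = kmeans(coeffs, k=2)
```

With real data, one would then look at the distribution of emission-line strengths within each continuum cluster, which is where the weak correlation described above would show up.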
We found that emission lines induce additional trends in the colours of the galaxies. These trends are not random because the line strengths are not completely random, even though they vary widely. The lines are sharp features that fall into different photometric bands depending on the redshift. Given that they can carry a significant amount of flux, they can alter the galaxy colours significantly compared to the continuum-only case! Hence, they can have a substantial effect on the photo-z estimates as well!
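A rough back-of-the-envelope sketch of this effect, in Python, is given below. It rests on several simplifying assumptions that are ours for illustration only: the band edges are rounded, approximate values rather than the real SDSS filter curves, the filters are treated as top-hats, and the continuum is taken to be flat across the band. Under those assumptions, a line with rest-frame equivalent width EW brightens a band of width Δλ by Δm = -2.5 log10(1 + EW (1 + z) / Δλ). The helper names `band_of_line` and `magnitude_shift` are hypothetical.

```python
import math

# Approximate band edges in Angstrom (rounded, illustrative values only;
# the real SDSS filters overlap and are not top-hats).
BANDS = {
    "u": (3000.0, 4000.0),
    "g": (4000.0, 5500.0),
    "r": (5500.0, 7000.0),
    "i": (7000.0, 8500.0),
    "z": (8500.0, 10000.0),
}

H_ALPHA = 6563.0  # rest-frame wavelength of H-alpha in Angstrom


def band_of_line(rest_wavelength, redshift):
    """Return the band containing the redshifted line, or None if outside."""
    observed = rest_wavelength * (1.0 + redshift)
    for name, (lo, hi) in BANDS.items():
        if lo <= observed < hi:
            return name
    return None


def magnitude_shift(rest_ew, redshift, band):
    """Magnitude change caused by a line of rest-frame equivalent width
    rest_ew (Angstrom), assuming a flat continuum and a top-hat filter."""
    lo, hi = BANDS[band]
    observed_ew = rest_ew * (1.0 + redshift)
    return -2.5 * math.log10(1.0 + observed_ew / (hi - lo))


for z in (0.0, 0.1, 0.3):
    band = band_of_line(H_ALPHA, z)
    print(z, band, round(magnitude_shift(100.0, z, band), 3))
```

In this toy setup, H-alpha moves from the r band through i into z as the redshift increases, and a 100 Å line shifts the affected band by several hundredths of a magnitude, which is enough to change a colour noticeably.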
Surprisingly, having such sharp features helps with the estimation and reduces the photo-z error. Given the relatively large variation in line strengths, this improvement is modest. However, the lines certainly do not make things worse! More importantly, they can reduce the effects of the photometric errors, especially in the low-redshift regime, where the emission lines allow estimators to perform almost as well as if there were no measurement uncertainties!