Multimodal distributions are commonly seen when analyzing composite variables such as insurance claims, where some large proportion are zero, but then the proportion of non-zero values take on a distribution of their own. Breaking down these sorts of distributions into their component parts allows us to more effetively model each piece and then recombine them at a later stage.
OneHotEncodersets the parameter
drop=Nonewhich in turn causes it to output $ k $ output columns. When then used to fit a linear model with intercept, this results in a situation where we have perfect multicollinearity, and so the model overfits the data using unrealistic coefficients.