Probabilistically split concatenated words into their component words, using unigram frequencies derived from English Wikipedia.
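
To illustrate the general idea, here is a minimal sketch of unigram-based word splitting. It is not necessarily how this project implements it. The sketch assumes a tiny hypothetical frequency table (`TOY_COUNTS`) in place of real Wikipedia counts, and the names `split_words` and `max_len` are made up for this example. Each candidate word is scored by its negative log probability, and dynamic programming picks the split with the lowest total cost.

```python
import math

# Hypothetical toy unigram counts, for illustration only. A real table
# would be derived from a large corpus such as English Wikipedia.
TOY_COUNTS = {
    "the": 500, "a": 300, "he": 100, "brown": 30,
    "own": 25, "quick": 20, "fox": 15,
}


def split_words(text, counts=TOY_COUNTS, max_len=20):
    """Return the most probable split of `text` under a unigram model."""
    total = sum(counts.values())

    def cost(word):
        # Cost is -log P(word); words missing from the table are disallowed.
        count = counts.get(word)
        return -math.log(count / total) if count else math.inf

    n = len(text)
    best = [0.0] + [math.inf] * n  # best[i]: lowest cost to split text[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word

    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            candidate = best[j] + cost(text[j:i])
            if candidate < best[i]:
                best[i] = candidate
                back[i] = j

    if math.isinf(best[n]):
        # No split made entirely of known words; return the input unchanged.
        return [text]

    # Walk the back-pointers from the end to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


print(split_words("thequickbrownfox"))  # ['the', 'quick', 'brown', 'fox']
```

Summing negative log probabilities is the same as multiplying the word probabilities, so the lowest-cost split is the most probable one under the assumption that words occur independently. This toy version returns the input unchanged when no split into known words exists. A practical tool would need a fallback cost for unknown words.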