I have a dataset of about 17,000 user records scraped from Twitter, and I am working with the Latent Dirichlet Allocation (LDA) algorithm. I want to split my dataset, but I am not sure what the best way to do it is.
What criteria should be used to split a dataset when training an LDA model?
I am using gensim to train the LDA model.
Thank you
He who fights with dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back…