Python decomposition.LatentDirichletAllocation class code examples


This article collects typical usage examples of the sklearn.decomposition.LatentDirichletAllocation class in Python. If you are wondering how LatentDirichletAllocation is used in practice, the selected examples below may help.

The 20 code examples below are sorted by popularity by default.

Example 1: applyLDA2

    def applyLDA2(self, number_of_clusters, country_specific_tweets):
        train, feature_names = self.extractFeatures(country_specific_tweets,False)
        
        name = "lda"
        if self.results:
            print("Fitting LDA model with tfidf", end= " - ")
        t0 = time()     
        lda = LatentDirichletAllocation(n_components=number_of_clusters, max_iter=5,
                                        learning_method='online', learning_offset=50.,
                                        random_state=0)

        lda.fit(train)
        
        if self.results:
            print("done in %0.3fs." % (time() - t0))
        
        parameters = lda.get_params()
        topics = lda.components_
        doc_topic = lda.transform(train)
        top10, labels = self.printTopicCluster(topics, doc_topic, feature_names)
        labels = numpy.asarray(labels)
        
        if self.results:
            print("Silhouette Coefficient {0}: {1}".format(name, metrics.silhouette_score(train, labels)))
        
        return name, parameters, top10, labels

Developer ID: michaelprummer, Project: datascience, Lines of code: 26, Source: clustering.py

Example 2: score_lda

def score_lda(src, dst):
	##read sentence pairs to two lists
	b1 = []
	b2 = []
	lines = 0
	with open(src) as p:
		for i, line in enumerate(p):
			s = line.split('\t')
			b1.append(s[0])
			b2.append(s[1][:-1]) #remove \n
			lines = i + 1

	vectorizer = CountVectorizer()
	vectors=vectorizer.fit_transform(b1 + b2)

	lda = LatentDirichletAllocation(n_components=n_topics, max_iter=5,
                                learning_method='online', learning_offset=50.,
                                random_state=0)
	X = lda.fit_transform(vectors)
	print(X.shape)
	b1_v = vectorizer.transform(b1)
	b2_v = vectorizer.transform(b2)
	b1_vecs = lda.transform(b1_v)
	b2_vecs = lda.transform(b2_v)

	res = [round(5*(1 - spatial.distance.cosine(b1_vecs[i], b2_vecs[i])),2) for i in range(lines)]
	with open(dst, 'w') as thefile:
		thefile.write("\n".join(str(i) for i in res))

Developer ID: wintor12, Project: SemEval2015, Lines of code: 28, Source: run.py

Example 3: plot_perplexity_batch

def plot_perplexity_batch(A_tfidf, num_docs):
    
    print("computing perplexity vs batch size...")
    max_iter = 5
    num_topics = 10
    batch_size = np.logspace(6, 10, 5, base=2).astype(int)
    perplexity = np.zeros((len(batch_size),max_iter))
    em_iter = np.zeros((len(batch_size),max_iter))
    for ii, mini_batch in enumerate(batch_size):
        for jj, sweep in enumerate(range(1,max_iter+1)):
            lda = LatentDirichletAllocation(n_components=num_topics, max_iter=sweep, learning_method='online', batch_size=mini_batch, random_state=0, n_jobs=-1)
            tic = time()
            lda.fit(A_tfidf)  #online VB
            toc = time()
            print("sweep %d, elapsed time: %.4f sec" % (sweep, toc - tic))
            perplexity[ii,jj] = lda.perplexity(A_tfidf)
            em_iter[ii,jj] = lda.n_batch_iter_
        #end
    #end
    np.save('./data/perplexity.npy', perplexity)
    np.save('./data/em_iter.npy', em_iter)    
    
    f = plt.figure()
    for mb in range(len(batch_size)):
        plt.plot(em_iter[mb,:], perplexity[mb,:], color=np.random.rand(3,), marker='o', lw=2.0, label='mini_batch: '+str(batch_size[mb]))
    plt.title('Perplexity (LDA, online VB)')
    plt.xlabel('EM iter')
    plt.ylabel('Perplexity')
    plt.grid(True)
    plt.legend()
    plt.show()
    f.savefig('./figures/perplexity_batch.png')

Developer ID: vsmolyakov, Project: ml, Lines of code: 32, Source: lda_vb.py

Example 4: get_features

    def get_features(vocab):
        vectorizer_head = TfidfVectorizer(vocabulary=vocab, use_idf=False, norm='l2')
        X_train_head = vectorizer_head.fit_transform(headlines)

        vectorizer_body = TfidfVectorizer(vocabulary=vocab, use_idf=False, norm='l2')
        X_train_body = vectorizer_body.fit_transform(bodies)

        # calculates n most important topics of the bodies. Each topic contains all words but ordered by importance. The
        # more important topic words a body contains of a certain topic, the higher its value for this topic
        lda_body = LatentDirichletAllocation(n_components=n_topics, learning_method='online', random_state=0, n_jobs=3)

        print("latent_dirichlet_allocation_cos: fit and transform body")
        t0 = time()
        lda_body_matrix = lda_body.fit_transform(X_train_body)
        print("done in %0.3fs." % (time() - t0))

        print("latent_dirichlet_allocation_cos: transform head")
        # use the lda trained for body topics on the headlines => if the headlines and bodies share topics
        # their vectors should be similar
        lda_head_matrix = lda_body.transform(X_train_head)

        #print_top_words(lda_body, vectorizer_body.get_feature_names_out(), 100)

        print('latent_dirichlet_allocation_cos: calculating cosine distance between head and body')
        # calculate cosine distance between the body and head
        X = []
        for i in range(len(lda_head_matrix)):
            X_head_vector = np.array(lda_head_matrix[i]).reshape((1, -1)) #1d array is deprecated
            X_body_vector = np.array(lda_body_matrix[i]).reshape((1, -1))
            cos_dist = cosine_distances(X_head_vector, X_body_vector).flatten()
            X.append(cos_dist.tolist())
        return X

Developer ID: paris5020, Project: athene_system, Lines of code: 32, Source: topic_models.py
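
The headline/body comparison in Example 4 can be reduced to a small self-contained sketch. The toy `headlines` and `bodies` are made up for illustration, `CountVectorizer` stands in for the project's fixed-vocabulary `TfidfVectorizer`, and `paired_cosine_distances` replaces the explicit row-by-row loop. None of these choices come from the original project, and the sketch assumes a scikit-learn release that accepts `n_components` (0.19 or later).

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import paired_cosine_distances

# Toy data, made up for illustration (not from the original project).
headlines = [
    "storm hits coast",
    "team wins final",
    "market prices fall",
]
bodies = [
    "a strong storm hit the coast and flooded several towns near the coast",
    "the home team wins the final after a late goal in the final minutes",
    "market prices fall sharply as investors sell shares and prices drop",
]

# Share one vocabulary so headline and body vectors have the same columns.
vectorizer = CountVectorizer().fit(headlines + bodies)
X_body = vectorizer.transform(bodies)
X_head = vectorizer.transform(headlines)

# Fit topics on the bodies only, then project the headlines onto them.
lda = LatentDirichletAllocation(n_components=3, learning_method="online",
                                random_state=0)
body_topics = lda.fit_transform(X_body)
head_topics = lda.transform(X_head)

# One cosine distance per (headline, body) pair, row by row.
distances = paired_cosine_distances(head_topics, body_topics)
print(np.round(distances, 3))
```

Because the topic vectors are non-negative, each distance falls between 0 (same topic mix) and 1 (no shared topics).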

Example 5: plot_perplexity_iter

def plot_perplexity_iter(A_tfidf, num_topics):
    
    print("computing perplexity vs iter...")
    max_iter = 5
    perplexity = []
    em_iter = []
    for sweep in range(1,max_iter+1):
        lda = LatentDirichletAllocation(n_components=num_topics, max_iter=sweep, learning_method='online', batch_size=512, random_state=0, n_jobs=-1)
        tic = time()
        lda.fit(A_tfidf)  #online VB
        toc = time()
        print("sweep %d, elapsed time: %.4f sec" % (sweep, toc - tic))
        perplexity.append(lda.perplexity(A_tfidf))
        em_iter.append(lda.n_batch_iter_)
    #end    
    np.save('./data/perplexity_iter.npy', perplexity)
    
    f = plt.figure()
    plt.plot(em_iter, perplexity, color='b', marker='o', lw=2.0, label='perplexity')
    plt.title('Perplexity (LDA, online VB)')
    plt.xlabel('EM iter')
    plt.ylabel('Perplexity')
    plt.grid(True)
    plt.legend()
    plt.show()
    f.savefig('./figures/perplexity_iter.png')

Developer ID: vsmolyakov, Project: ml, Lines of code: 26, Source: lda_vb.py

Example 6: LDA

def LDA(tf,word):
    lda = LatentDirichletAllocation(n_components=30, max_iter=5,
                                learning_method='online',
                                learning_offset=50.,
                                random_state=0)
    lda.fit(tf)
    print_top_words(lda,word,20)

Developer ID: zhangweijiqn, Project: testPython, Lines of code: 7, Source: testTFIDFandLDA.py

Example 7: lda_tuner

def lda_tuner(ingroup_otu, best_models):

    best_score = -1*np.inf
    dtp_series = [0.0001, 0.001, 0.01, 0.1, 0.2]
    twp_series = [0.0001, 0.001, 0.01, 0.1, 0.2]
    topic_series = [3]
    X = ingroup_otu.values
    eval_counter = 0

    for topics in topic_series: 
        for dtp in dtp_series:
            for twp in twp_series:
                eval_counter +=1
                X_train, X_test = train_test_split(X, test_size=0.5)
                lda = LatentDirichletAllocation(n_components=topics,
                                                doc_topic_prior=dtp, 
                                                topic_word_prior=twp, 
                                                learning_method='batch',
                                                random_state=42,
                                                max_iter=20)
                lda.fit(X_train)
                this_score = lda.score(X_test)
                this_perplexity = lda.perplexity(X_test)
                if this_score > best_score:
                    best_score = this_score
                    print("New Max Likelihood: {}".format(best_score))

                print("#{}: n:{}, dtp:{}, twp:{}, score:{}, perp:{}".format(eval_counter,
                                                                 topics, dtp, twp,
                                                                 this_score, this_perplexity))

                best_models.append({'n': topics, 'dtp': dtp, 'twp': twp,
                                    'score': this_score, 'perp': this_perplexity})
                if (dtp == dtp_series[-1]) and (twp == twp_series[-1]):
                    eval_counter +=1
                    X_train, X_test = train_test_split(X, test_size=0.5)
                    lda = LatentDirichletAllocation(n_components=topics,
                                                    doc_topic_prior=1./topics, 
                                                    topic_word_prior=1./topics, 
                                                    learning_method='batch',
                                                    random_state=42,
                                                    max_iter=20)
                    lda.fit(X_train)
                    this_score = lda.score(X_test)
                    this_perplexity = lda.perplexity(X_test)
                    if this_score > best_score:
                        best_score = this_score
                        print("New Max Likelihood: {}".format(best_score))

                    print("#{}: n:{}, dtp:{}, twp:{}, score:{} perp: {}".format(eval_counter,
                                                                                topics,
                                                                                (1./topics),
                                                                                (1./topics),
                                                                                this_score,
                                                                                this_perplexity))

                    best_models.append({'n': topics, 'dtp': (1./topics), 
                                        'twp': (1./topics), 'score': this_score,
                                        'perp': this_perplexity})
    return best_models

Developer ID: karoraw1, Project: GLM_Wrapper, Lines of code: 60, Source: otu_ts_support.py
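
The nested loops in Example 7 score every combination of the two priors on a held-out split. A more compact way to express a similar search is sketched below. It is not the original author's method: it swaps the single random 50/50 split for k-fold cross-validation through `GridSearchCV`, which falls back to the estimator's own `score` method (the approximate log-likelihood) when no scorer is given. The synthetic count matrix and the shortened prior grid are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV

# Synthetic count matrix standing in for the OTU table (illustrative only).
rng = np.random.RandomState(0)
X = rng.randint(5, size=(60, 12))

# Shortened grid of document-topic and topic-word priors.
param_grid = {
    "doc_topic_prior": [0.001, 0.1],
    "topic_word_prior": [0.001, 0.1],
}

lda = LatentDirichletAllocation(n_components=3, learning_method="batch",
                                max_iter=20, random_state=42)

# With no `scoring` argument, each fold is scored with lda.score(X_test),
# so a higher (less negative) value is better.
search = GridSearchCV(lda, param_grid, cv=3)
search.fit(X)

print(search.best_params_)
print(search.best_score_)
```

The per-combination results are available in `search.cv_results_`, which plays the role of the `best_models` list in the example.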

Example 8: _get_model_LDA

 def _get_model_LDA(self, corpus):
     #lda = models.LdaModel(corpus, id2word=self.corpus.dictionary, num_topics=5, alpha='auto', eval_every=50)
     lda = LatentDirichletAllocation(n_components=self.num_of_clusters, max_iter=20,
                                     learning_method='online',
                                     learning_offset=50.,
                                     random_state=1)
     return lda.fit_transform(corpus)

Developer ID: AnastasiaProk, Project: ws2018_forum_analyzer, Lines of code: 7, Source: cluster.py

Example 9: produceLDATopics

def produceLDATopics():
    '''
    Takes description of each game and uses sklearn's latent dirichlet allocation and count vectorizer
    to extract topics.
    :return: pandas data frame with topic weights for each game (rows) and topic (columns)
    '''
    data_samples, gameNames = create_game_profile_df(game_path)
    tf_vectorizer = CountVectorizer(max_df=0.95, min_df=2, max_features=n_features, stop_words='english')
    tf = tf_vectorizer.fit_transform(data_samples)
    lda = LatentDirichletAllocation(n_components=n_topics, max_iter=5,
                                    learning_method='online', learning_offset=50.,
                                    random_state=0)
    topics = lda.fit_transform(tf)
    # for i in range(50):
    #     gameTopics = []
    #     for j in range(len(topics[0])):
    #         if topics[i,j] > 1.0/float(n_topics):
    #             gameTopics.append(j)
    #     print gameNames[i], gameTopics
    topicsByGame = pandas.DataFrame(topics)
    topicsByGame.index = gameNames
    print(topicsByGame)

    tf_feature_names = tf_vectorizer.get_feature_names_out()
    for topic_idx, topic in enumerate(lda.components_):
        print("Topic #%d:" % topic_idx)
        print(" ".join([tf_feature_names[i]
                        for i in topic.argsort()[:-n_top_words - 1:-1]]))

    return topicsByGame

Developer ID: USF-ML2, Project: Steamed_Up, Lines of code: 30, Source: gameRec_getLDAtopics.py

Example 10: fit_lda

def fit_lda(tf):
    '''takes in a tf sparse vector and finds the top topics'''
    lda = LatentDirichletAllocation(n_components=n_topics, max_iter=5, learning_method='online', learning_offset=50., random_state=0)
    lda.fit(tf)
    tf_feature_names = tf_vectorizer.get_feature_names_out()
    lda_topic_dict = print_top_words(lda, tf_feature_names, n_top_words)
    return lda, lda_topic_dict

Developer ID: scsherm, Project: Congress_work, Lines of code: 7, Source: topic_modeling2.py

Example 11: basic_lda

def basic_lda(df, n_topics=200, max_df=0.5, min_df=5):
    '''
    Basic LDA model for album recommendations

    Args:
        df: dataframe with Pitchfork reviews
        n_topics: number of lda topics
        max_df: max_df in CountVectorizer
        min_df: min_df in CountVectorizer
    Returns:
        cv: sklearn fitted CountVectorizer
        cv_trans: sparse matrix with count-transformed data
        lda: sklearn fitted LatentDirichletAllocation
        lda_trans: dense array with lda transformed data

    '''

    X = df['review']
    cv = CountVectorizer(stop_words='english',
                         min_df=min_df,
                         max_df=max_df)
    cv_trans = cv.fit_transform(X)

    lda = LatentDirichletAllocation(n_components=n_topics, max_iter=7)
    lda_trans = lda.fit_transform(cv_trans)

    return cv, cv_trans, lda, lda_trans

Developer ID: lwoloszy, Project: albumpitch, Lines of code: 27, Source: eda.py

Example 12: plot_perplexity_topics

def plot_perplexity_topics(A_tfidf):
    
    print("computing perplexity vs K...")
    max_iter = 5    #based on plot_perplexity_iter()
    #num_topics = np.linspace(2,20,5).astype(int)
    num_topics = np.logspace(1,2,5).astype(int)
    perplexity = []
    em_iter = []
    for k in num_topics:
        lda = LatentDirichletAllocation(n_components=k, max_iter=max_iter, learning_method='online', batch_size=512, random_state=0, n_jobs=-1)
        tic = time()
        lda.fit(A_tfidf)  #online VB
        toc = time()
        print("K= %d, elapsed time: %.4f sec" % (k, toc - tic))
        perplexity.append(lda.perplexity(A_tfidf))
        em_iter.append(lda.n_batch_iter_)
    #end
    
    np.save('./data/perplexity_topics.npy', perplexity)
    np.save('./data/perplexity_topics2.npy', num_topics)    
    
    f = plt.figure()
    plt.plot(num_topics, perplexity, color='b', marker='o', lw=2.0, label='perplexity')
    plt.title('Perplexity (LDA, online VB)')
    plt.xlabel('Number of Topics, K')
    plt.ylabel('Perplexity')
    plt.grid(True)
    plt.legend()
    plt.show()
    f.savefig('./figures/perplexity_topics.png')

Developer ID: vsmolyakov, Project: ml, Lines of code: 30, Source: lda_vb.py
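
The perplexity sweeps in Examples 3, 5 and 12 share one core loop: fit a model per setting and record `perplexity`. A plotting-free sketch of that loop is given below. The function name `perplexity_by_topics`, the synthetic count matrix, and the topic counts are assumptions for illustration. As in the examples, perplexity is measured on the training matrix itself, so it is not a held-out estimate.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation


def perplexity_by_topics(X, topic_counts, max_iter=5):
    """Fit one online-VB LDA model per topic count and record its perplexity."""
    results = {}
    for k in topic_counts:
        lda = LatentDirichletAllocation(n_components=int(k), max_iter=max_iter,
                                        learning_method="online",
                                        batch_size=512, random_state=0)
        lda.fit(X)
        # Lower perplexity means the model assigns higher likelihood to X.
        results[int(k)] = lda.perplexity(X)
    return results


# Synthetic count matrix, for illustration only.
rng = np.random.RandomState(0)
X = rng.randint(5, size=(40, 15))

scores = perplexity_by_topics(X, [2, 5, 10])
for k, value in sorted(scores.items()):
    print("K=%d perplexity=%.2f" % (k, value))
```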

Example 13: extractTopicLDA

def extractTopicLDA(func_message_dic, store_cloumn):
    if len(func_message_dic) == 0:
        print("func_message_dic is null")
        return False
    try:
        conn=MySQLdb.connect(host='192.168.162.122',user='wangyu',passwd='123456',port=3306)
        cur=conn.cursor()
        cur.execute('set names utf8mb4')
        conn.select_db('codeAnalysis')
        for function in func_message_dic:
            message = func_message_dic[function]
            np_extractor = nlp.semantics_extraction.NPExtractor(message)
            text = np_extractor.extract()
            if len(text) == 0:
                continue
            tf_vectorizer = CountVectorizer(max_df=1.0, min_df=1, max_features=n_features, stop_words='english')
            tf = tf_vectorizer.fit_transform(text)
            print("Fitting LDA models with tf features, n_samples=%d and n_features=%d..." % (n_samples, n_features))
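            lda = LatentDirichletAllocation(n_components=n_topics, max_iter=5, learning_method='online', learning_offset=50.,
                                                                    random_state=0)
            lda.fit(tf)
            tf_feature_names = tf_vectorizer.get_feature_names_out()
            seprator = " "
            # keywords is overwritten on each pass, so only the last topic's top words are stored
            for topic_idx, topic in enumerate(lda.components_):
                keywords = seprator.join([tf_feature_names[i] for i in topic.argsort()[:-n_top_words - 1:-1]])
            # pass the values as query parameters instead of concatenating them into the SQL string;
            # the column name cannot be parameterized and must come from a trusted caller
            sql = "update func_semantic set " + store_cloumn + " = %s where func_name = %s"
            print(sql)
            cur.execute(sql, (keywords, function))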
            conn.commit()
        cur.close()
        conn.close()
        return True
    except MySQLdb.Error as e:
        print(e)
        raise

Developer ID: wangyufish, Project: code-mine, Lines of code: 35, Source: frequentSubstringMining.py

Example 14: topicmodel

def topicmodel( comments ):

    _texts = []
    texts = []

    for c in comments:

        c = c['text']
        _texts.append( c )
        texts.append( c )



    tf_vectorizer = CountVectorizer(
                max_df=.20,
                min_df=10,
                stop_words = stopwords )
    texts = tf_vectorizer.fit_transform( texts )

    ## test between 2 and 9 topics
    topics = {}

    for k in range(2, 10):

        print("Testing", k)

        model = LatentDirichletAllocation(
                    n_components= k ,
                    max_iter=5,
                    learning_method='batch',
                    learning_offset=50.,
                    random_state=0
                )
        model.fit( texts )
        ll = model.score( texts )
        topics[ ll ] = model

    topic = max( topics.keys() )

    ret = collections.defaultdict( list )

    ## ugly, rewrite some day
    model = topics[ topic ]

    ## for debugging, print the chosen model's topics
    feature_names = tf_vectorizer.get_feature_names_out()
    for topic_idx, topic in enumerate(model.components_):
        print("Topic #%d:" % topic_idx)
        print(" ".join( [feature_names[i] for i in topic.argsort()[:-5 - 1:-1]]))
        print()

    for i, topic in enumerate( model.transform( texts ) ):

        topic = numpy.argmax( topic )
        text = _texts[ i ].encode('utf8')

        ret[ topic ].append( text )

    return ret

Developer ID: matnel, Project: hs-comments-visu, Lines of code: 59, Source: main.py
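
The last step of Example 14, assigning each text to its highest-weighted topic, can be isolated as below. This is a minimal sketch: the function name `group_by_dominant_topic` and the toy texts are hypothetical, and the model is fitted with fixed settings rather than selected by log-likelihood as in the example.

```python
import collections

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer


def group_by_dominant_topic(model, X, texts):
    """Map each topic index to the texts whose largest topic weight it is."""
    groups = collections.defaultdict(list)
    # transform() returns one row of topic weights per document.
    for text, weights in zip(texts, model.transform(X)):
        groups[int(np.argmax(weights))].append(text)
    return groups


# Toy comments, made up for illustration.
texts = [
    "great match and a great goal",
    "the goal came late in the match",
    "taxes and budget dominate the debate",
    "the budget debate turns to taxes",
]
X = CountVectorizer().fit_transform(texts)
model = LatentDirichletAllocation(n_components=2, learning_method="batch",
                                  max_iter=5, random_state=0).fit(X)

groups = group_by_dominant_topic(model, X, texts)
for topic, members in sorted(groups.items()):
    print(topic, members)
```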

Example 15: latdirall

def latdirall(content):
    lda = LatentDirichletAllocation(n_components=10)
    tf_vectorizer = TfidfVectorizer(max_df=0.99, min_df=1,
                                stop_words='english')
    tf = tf_vectorizer.fit_transform(content)
    lolz = lda.fit_transform(tf)
    tfidf_feature_names = tf_vectorizer.get_feature_names_out()
    return top_topics(lda, tfidf_feature_names, 10)

Developer ID: nowittynamesleft, Project: Machine-Learning, Lines of code: 8, Source: bagofwords.py

Example 16: __init__

class LDATopics:
	# Constructor
	def __init__(self, filename):
		# Member variables
		self.email_data = []
		self.lda = None
		self.feature_names = None
		self.num_topics = NUM_TOPICS
		self.num_words_per_topic = NUM_WORDS_PER_TOPIC
		self.num_features = NUM_FEATURES

		# Load emails from full path to file
		emails = EmailLoader(filename).get_email_dict_array()

		# Process emails into a list of email body contents
		for email_rec in emails:
			if email_rec['body']:
				# Clean the text and add to list
				cleaner = TextCleaner(email_rec['body'])

				self.email_data.append(" ".join(cleaner.tokenize_str()))

	## Public methods ##
	def process(self, topics=None, features=None):
		# Check if default numbers should be used
		if topics is None:
			topics = self.num_topics
			
		if features is None:
			features = self.num_features

		# Calculate term frequency for LDA
		tf_vectorizer = CountVectorizer(max_df=0.95, min_df=2, max_features=features, stop_words='english')
		tf = tf_vectorizer.fit_transform(self.email_data)

		# Fit the LDA model to data samples
		self.lda = LatentDirichletAllocation(n_components=topics, max_iter=5, learning_method='online', learning_offset=50., random_state=0)

		self.lda.fit(tf)

		# Set the feature name (words)
		self.feature_names = tf_vectorizer.get_feature_names_out()

	def print_topics(self, words_per_topic=None):
		# Check if default number of words per topics should be used
		if words_per_topic is None:
			words_per_topic = self.num_words_per_topic

		self._print_topics(self.lda, self.feature_names, words_per_topic)

	## Private methods ##
	def _print_topics(self, model, feature_names, words_per_topic):
	    for topic_idx, topic in enumerate(model.components_):
	        print("Topic #%d:" % topic_idx)
	        print(" ".join([feature_names[i]
	                        for i in topic.argsort()[:-words_per_topic - 1:-1]]))

	    print()

Developer ID: jcora-nyt, Project: topics_inference, Lines of code: 58, Source: lda_topics.py

Example 17: perform_analysis

 def perform_analysis(self, stocks, szTimeAxis, n_ahead):
     # load Snowball comment data
     from agares.datasource.snowball_cmt_loader import SnowballCmtLoader
     SBLoader = SnowballCmtLoader()
     date = self.dt_start.date()
     df_cmt_list = []
     while date <= self.dt_end.date():
         df_cmt_list.append(SBLoader.load(str(date)))
         date += timedelta(days=1)
     df_cmt = pd.concat(df_cmt_list, ignore_index=True)
     # Chinese text segmentation
     self.set_jieba()
     df_cmt['RawComment'] = df_cmt['RawComment'].map(jieba.cut)
     # drop stopwords
     self.stopwords = [line.strip() for line in open('stopwords').readlines()]
     self.stopwords.append(' ')
     df_cmt['RawComment'] = df_cmt['RawComment'].map(self.drop_useless_word)
     cmt = df_cmt['RawComment'].tolist()
     # construct tfidf matrix
     tfidf_vectorizer = TfidfVectorizer(ngram_range=(1, 3), max_df=0.95, min_df=0.05)
     tfidf = tfidf_vectorizer.fit_transform(cmt)
     
     # Fit the NMF model
     n_topics = 5
     n_top_words = 20
     print("Fitting the NMF model with tf-idf features..")
     t0 = time()
     nmf = NMF(n_components=n_topics, random_state=1, alpha=.1, l1_ratio=.5).fit(tfidf)
     print("done in %0.3fs." % (time() - t0))
     print("\nTopics in NMF model:")
     tfidf_feature_names = tfidf_vectorizer.get_feature_names_out()
     self.print_top_words(nmf, tfidf_feature_names, n_top_words)
     
     # Fit the LDA model
     print("Fitting LDA models with tf-idf features..")
     lda = LatentDirichletAllocation(n_components=n_topics, max_iter=10,
                                     learning_method='online', learning_offset=50.,
                                     random_state=0)
     t0 = time()
     lda.fit(tfidf)
     print("done in %0.3fs." % (time() - t0))
     print("\nTopics in LDA model:")
     self.print_top_words(lda, tfidf_feature_names, n_top_words)
     
     # load sz daily candlestick data
     sz = next(iter(stocks))
     cst_Day = stocks[sz].cst['1Day'] 
     # print close price within the timescope
     date = self.dt_start
     print()
     print("The ShangHai stock Index (close index) within the timescope")
     while date <= self.dt_end:
         ts = pd.to_datetime(date)
         try:
             print("Date: {0:s}, Index: {1:.2f}".format(str(date.date()), cst_Day.at[ts, 'close']))
         except KeyError: # sz candlestick data does not exist at this datetime
             print("Date: {0:s}, Index: (market closed)".format(str(date.date())))
         date += timedelta(days=1)

Developer ID: ssjatmhmy, Project: agares, Lines of code: 58, Source: snowball_comments_analysis_demo.py

Example 18: LDA

def LDA(matrix,preserve,n_topics=100):

    lda = LatentDirichletAllocation(n_components=n_topics, max_iter=10,
                                        learning_method='online', learning_offset=50.,
                                        random_state=randint(1,100))
    lda.fit(matrix[preserve])
    topic_model=lda.transform(matrix)

    return topic_model

Developer ID: azhe825, Project: CSC510, Lines of code: 9, Source: func_GUI.py

Example 19: test_lda_transform

def test_lda_transform():
    # Test LDA transform.
    # Transform result cannot be negative
    rng = np.random.RandomState(0)
    X = rng.randint(5, size=(20, 10))
    n_topics = 3
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=rng)
    X_trans = lda.fit_transform(X)
    assert (X_trans > 0.0).any()

Developer ID: rsteca, Project: scikit-learn, Lines of code: 9, Source: test_online_lda.py

Example 20: test_lda_fit_transform

def test_lda_fit_transform(method):
    # Test LDA fit_transform & transform
    # fit_transform and transform result should be the same
    rng = np.random.RandomState(0)
    X = rng.randint(10, size=(50, 20))
    lda = LatentDirichletAllocation(n_components=5, learning_method=method,
                                    random_state=rng)
    X_fit = lda.fit_transform(X)
    X_trans = lda.transform(X)
    assert_array_almost_equal(X_fit, X_trans, 4)

Developer ID: AlexisMignon, Project: scikit-learn, Lines of code: 10, Source: test_online_lda.py

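Note: the sklearn.decomposition.LatentDirichletAllocation class examples in this article were compiled by 纯净天空 from source-code and documentation hosting platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects, and copyright in the source code remains with the original authors; for distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.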
