• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    迪恩网络公众号

Python covariance.EllipticEnvelope类代码示例

原作者: [db:作者] 来自: [db:来源] 收藏 邀请

本文整理汇总了Python中sklearn.covariance.EllipticEnvelope的典型用法代码示例。如果您正苦于以下问题:Python EllipticEnvelope类的具体用法?Python EllipticEnvelope怎么用?Python EllipticEnvelope使用的例子?那么恭喜您, 这里精选的类代码示例或许可以为您提供帮助。



在下文中一共展示了EllipticEnvelope类的20个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于我们的系统推荐出更棒的Python代码示例。

示例1: filter_remove_outlayers

    def filter_remove_outlayers(self, flat, minimum_value=0):
        """
        Remove outlayers using ellicptic envelope from scikits learn
        :param flat:
        :param minimum_value:
        :return:
        """
        from sklearn.covariance import EllipticEnvelope
        flat0 = flat.copy()
        flat0[np.isnan(flat)] = 0
        x,y = np.nonzero(flat0)
        # print np.prod(flat.shape)
        # print len(y)

        z = flat[(x,y)]

        data = np.asarray([x,y,z]).T

        clf = EllipticEnvelope(contamination=.1)
        clf.fit(data)
        y_pred = clf.decision_function(data)


        out_inds = y_pred < minimum_value
        flat[(x[out_inds], y[out_inds])] = np.NaN
        return flat
开发者ID:andrlikjirka,项目名称:lisa,代码行数:26,代码来源:body_navigation.py


示例2: clean_series

    def clean_series(self, token, discard=5):

        """
        Remove outliers from the ratio series for a token.

        Args:
            discard (int): Drop the most outlying X% of the data.

        Returns: OrderedDict{year: wpm}
        """

        series = self.ratios[token]

        X = np.array(list(series.values()))[:, np.newaxis]

        env = EllipticEnvelope()
        env.fit(X)

        # Score each data point.
        y_pred = env.decision_function(X).ravel()

        # Get the discard threshold.
        threshold = stats.scoreatpercentile(y_pred, discard)

        return OrderedDict([
            (year, ratio)
            for (year, ratio), pred in zip(series.items(), y_pred)
            if pred > threshold
        ])
开发者ID:davidmcclure,项目名称:history-of-literature,代码行数:29,代码来源:wpm_ratios.py


示例3: outlier_removal2

def outlier_removal2(features, samples, cv_predict):

    outliers_fraction = 0.1

    print cv_predict.shape
    print samples.shape
    test = np.column_stack((cv_predict, samples))
    #clf = EllipticEnvelope(contamination=.1)
    clf = EllipticEnvelope(contamination=.1)
    #clf = svm.OneClassSVM(nu=0.95 * outliers_fraction + 0.05,
    #                                 kernel="rbf", gamma=0.1)
    clf.fit(test)
    y_pred = clf.decision_function(test).ravel()
    threshold = stats.scoreatpercentile(y_pred,
                                        100 * outliers_fraction)

    y_pred_new = y_pred > threshold
    print y_pred_new
    #print samples[y_pred_new]
    print samples.shape
    print samples[y_pred_new].shape
    print features.shape
    print features[y_pred_new].shape

    return features[y_pred_new], samples[y_pred_new]
开发者ID:openforis,项目名称:opensarkit,代码行数:25,代码来源:ost_rf_regressor.backup.py


示例4: calc

    def calc(self,outliers_fraction):
        

        data, dqs, raw = self.get_data()
        clf = EllipticEnvelope(contamination=outliers_fraction)
        X = zip(data['Tbandwidth'],data['Tlatency'],data['Tframerate'])
        clf.fit(X)
        #data['y_pred'] = clf.decision_function(X).ravel()
        #data['y_pred'] = clf.decision_function(X).ravel()
        
        #threshold = np.percentile(data['y_pred'],100 * outliers_fraction)
        data['MDist']=clf.mahalanobis(X)
        
        #picking "bad" outliers, not good ones
        outliers = chi2_outliers(data, [.8,.9,.95], 3)
        #print outliers
        outliers = [i[i['Tbandwidth']<i['Tlatency']] for i in outliers]
        
        #outliers = data[data['y_pred']<threshold]
        #data['y_pred'] = data['y_pred'] > threshold
        #outliers = [x[['ticketid','MDist']].merge(raw, how='inner').drop_duplicates() for x in outliers]
        #print raw
        #outliers = [raw[raw['ticketid'].isin(j['ticketid'])] for j in outliers]
        outliers = [k[k['Tframerate']<(k['Tframerate'].mean()+k['Tframerate'].std())] for k in outliers] #making sure we don't remove aberrantly good framrates
        outliers = [t.sort_values(by='MDist', ascending=False).drop_duplicates().drop(['Tbandwidth','Tlatency','Tframerate'],axis=1) for t in outliers]
        
        #dqs = raw[raw['ticketid'].isin(dqs['ticketid'])]
        #data = data.sort_values('MDist', ascending=False).drop_duplicates()
        
        return outliers, dqs, data.sort_values(by='MDist', ascending=False).drop_duplicates().drop(['Tbandwidth','Tlatency','Tframerate'],axis=1)
开发者ID:justinstuck,项目名称:Frame,代码行数:30,代码来源:outlierDetectionExample.py


示例5: model_2_determine_test_data_similarity

 def model_2_determine_test_data_similarity(self,model):
     clf_EE={}
     model_EE={}
     for i in range(len(model)):
         clf=EllipticEnvelope(contamination=0.01,support_fraction=1)
         clf_EE[i]=clf
         EEmodel=clf.fit(model[i])
         model_EE[i]=EEmodel
     return clf_EE,model_EE
开发者ID:sandialabs,项目名称:BioCompoundML,代码行数:9,代码来源:cluster.py


示例6: plot

def plot(X, y):
    proj = TSNE().fit_transform(X)
    e = EllipticEnvelope(assume_centered=True, contamination=.25) # Outlier detection
    e.fit(X)

    good = np.where(e.predict(X) == 1)
    X = X[good]
    y = y[good]

    scatter(proj, y)
开发者ID:vortext,项目名称:DeepOntology,代码行数:10,代码来源:experiment.py


示例7: filterOut

def filterOut(x):
    x = np.array(x)
    outliers_fraction=0.05
    #clf = svm.OneClassSVM(nu=0.95 * outliers_fraction + 0.05,  kernel="rbf", gamma=0.1) 
    clf = EllipticEnvelope(contamination=outliers_fraction)
    clf.fit(x)
    y_pred = clf.decision_function(x).ravel()
    threshold = stats.scoreatpercentile(y_pred,
                                        100 * outliers_fraction)
    y_pred = y_pred > threshold
    return y_pred
开发者ID:ranBernstein,项目名称:LDA_Syntetic,代码行数:11,代码来源:main.py


示例8: module4

    def module4(self):
        '''
            入力された一次元配列からanomaly detectionを用いて外れ値を検出する
        '''

        # get data
        img = cv2.imread('../saliency_detection/image/pearl.png')
        b,g,r = cv2.split(img) 
        B,G,R = map(lambda x,y,z: x*1. - (y*1. + z*1.)/2., [b,g,r],[r,r,g],[g,b,b])

        Y = (r*1. + g*1.)/2. - np.abs(r*1. - g*1.)/2. - b*1.
        # 負の部分は0にする
        R[R<0] = 0
        G[G<0] = 0
        B[B<0] = 0
        Y[Y<0] = 0
        rg = cv2.absdiff(R,G)
        by = cv2.absdiff(B,Y)
        img1 = rg
        img2 = by

        rg, by = map(lambda x:x.reshape((len(b[0])*len(b[:,0]),1)),[rg,by])
        data = np.hstack((rg,by))
        data = data.astype(np.float64)
        data = np.delete(data, range( 0,len(data[:,0]),2),0)

        # grid
        xx1, yy1 = np.meshgrid(np.linspace(-10, 300, 500), np.linspace(-10, 300, 500))
        
        # 学習して境界を求める # contamination大きくすると円は小さく
        clf = EllipticEnvelope(support_fraction=1, contamination=0.01)
        print 'data.shape =>',data.shape
        print 'learning...'
        clf.fit(data) #学習 # 0があるとだめっぽいかも
        print 'complete learning!'

        # 学習した分類器に基づいてデータを分類して楕円を描画
        z1 = clf.decision_function(np.c_[xx1.ravel(), yy1.ravel()])
        z1 = z1.reshape(xx1.shape)
        plt.contour(xx1,yy1,z1,levels=[0],linewidths=2,colors='r')

        # plot
        plt.scatter(data[:,0],data[:,1],color= 'black')
        plt.title("Outlier detection")
        plt.xlim((xx1.min(), xx1.max()))
        plt.ylim((yy1.min(), yy1.max()))
        plt.pause(.001)
        # plt.show()
        
        cv2.imshow('rg',img1/np.amax(img1))
        cv2.imshow('by',img2/np.amax(img2))
开发者ID:DriesDries,项目名称:shangri-la,代码行数:51,代码来源:anomaly.py


示例9: find_outlier_test_homes

def find_outlier_test_homes(df,all_homes,  appliance, outlier_features, outliers_fraction=0.1):
    from scipy import stats

    from sklearn import svm
    from sklearn.covariance import EllipticEnvelope
    clf = EllipticEnvelope(contamination=.1)
    try:
        X = df.ix[all_homes[appliance]][outlier_features].values
        clf.fit(X)
    except:
        try:
            X = df.ix[all_homes[appliance]][outlier_features[:-1]].values
            clf.fit(X)
        except:
            try:
                X = df.ix[all_homes[appliance]][outlier_features[:-2]].values
                clf.fit(X)
            except:
                print "outlier cannot be found"
                return df.ix[all_homes[appliance]].index.tolist()


    y_pred = clf.decision_function(X).ravel()
    threshold = stats.scoreatpercentile(y_pred,
                                        100 * outliers_fraction)
    y_pred = y_pred > threshold
    return df.ix[all_homes[appliance]][~y_pred].index.tolist()
开发者ID:nipunbatra,项目名称:Gemello,代码行数:27,代码来源:all_functions.py


示例10: ellipticenvelope

def ellipticenvelope(data, fraction = 0.02):
    elenv = EllipticEnvelope(contamination=fraction)
    elenv.fit(data)
    score = elenv.predict(data)

    numeration = [[i] for i in xrange(1, len(data)+1, 1)]
    numeration = np.array(numeration)
    y = np.hstack((numeration, score))

    anomalies = numeration
    for num,s in y:
        if (y == 1):
            y = np.delete(anomalies, num-1, axis=0)

    return anomalies
开发者ID:bondarchukYV,项目名称:AD,代码行数:15,代码来源:ellipticenvelope.py


示例11: elliptic_envelope

def elliptic_envelope(df, modelDir, norm_confidence=0.95):
	from sklearn.covariance import EllipticEnvelope
	from scipy.stats import normaltest

	if "ds" in df.columns:
		del df["ds"]
	model = EllipticEnvelope()
	test_stats, p_vals = normaltest(df.values, axis=0)
	normal_cols = p_vals >= (1 - norm_confidence)
	df = df.loc[:, normal_cols]
	if df.shape[1] == 0:
		return None
	df.outlier = model.fit_predict(df.values)
	df.outlier = df.outlier < 0  # 1 if inlier, -1 if outlier
	return df
开发者ID:dpinney,项目名称:omf,代码行数:15,代码来源:anomalyDetection.py


示例12: labelValidSkeletons

def labelValidSkeletons(skel_file, valid_index, trajectories_data, fit_contamination = 0.05):
    #calculate valid widths if they were not used
    calculate_widths(skel_file)
    
    #calculate classifier for the outliers    
    X4fit = nodes2Array(skel_file, valid_index)        
    clf = EllipticEnvelope(contamination = fit_contamination)
    clf.fit(X4fit)
    
    #calculate outliers using the fitted classifier
    X = nodes2Array(skel_file) #use all the indexes
    y_pred = clf.decision_function(X).ravel() #less than zero would be an outlier

    #labeled rows of valid individual skeletons as GOOD_SKE
    trajectories_data['auto_label'] = ((y_pred>0).astype(np.int))*wlab['GOOD_SKE'] #+ wlab['BAD']*np.isnan(y_prev)
    saveLabelData(skel_file, trajectories_data)
开发者ID:ver228,项目名称:Work_In_Progress,代码行数:16,代码来源:getFilteredFeats_N.py


示例13: test_score_samples

def test_score_samples():
    X_train = [[1, 1], [1, 2], [2, 1]]
    clf1 = EllipticEnvelope(contamination=0.2).fit(X_train)
    clf2 = EllipticEnvelope().fit(X_train)
    assert_array_equal(clf1.score_samples([[2., 2.]]),
                       clf1.decision_function([[2., 2.]]) + clf1.offset_)
    assert_array_equal(clf2.score_samples([[2., 2.]]),
                       clf2.decision_function([[2., 2.]]) + clf2.offset_)
    assert_array_equal(clf1.score_samples([[2., 2.]]),
                       clf2.score_samples([[2., 2.]]))
开发者ID:AlexisMignon,项目名称:scikit-learn,代码行数:10,代码来源:test_elliptic_envelope.py


示例14: labelValidSkeletons

def labelValidSkeletons(skel_file):
    calculate_widths(skel_file)
    
    #get valid rows using the trajectory displacement and the skeletonization success
    valid_index, trajectories_data = getValidIndexes(skel_file)
    
    #calculate classifier for the outliers    
    X4fit = nodes2Array(skel_file, valid_index)        
    clf = EllipticEnvelope(contamination=.1)
    clf.fit(X4fit)
    
    #calculate outliers using the fitted classifier
    X = nodes2Array(skel_file)
    y_pred = clf.decision_function(X).ravel() #less than zero would be an outlier

    #labeled rows of valid individual skeletons as GOOD_SKE
    trajectories_data['auto_label'] = ((y_pred>0).astype(np.int))*wlab['GOOD_SKE'] #+ wlab['BAD']*np.isnan(y_prev)
    saveLabelData(skel_file, trajectories_data)
开发者ID:ver228,项目名称:Work_In_Progress,代码行数:18,代码来源:extract_feat.py


示例15: test_outlier_detection

def test_outlier_detection():
    rnd = np.random.RandomState(0)
    X = rnd.randn(100, 10)
    clf = EllipticEnvelope(contamination=0.1)
    assert_raises(NotFittedError, clf.predict, X)
    assert_raises(NotFittedError, clf.decision_function, X)
    clf.fit(X)
    y_pred = clf.predict(X)
    decision = clf.decision_function(X, raw_values=True)
    decision_transformed = clf.decision_function(X, raw_values=False)

    assert_array_almost_equal(decision, clf.mahalanobis(X))
    assert_array_almost_equal(clf.mahalanobis(X), clf.dist_)
    assert_almost_equal(clf.score(X, np.ones(100)), (100 - y_pred[y_pred == -1].size) / 100.0)
    assert sum(y_pred == -1) == sum(decision_transformed < 0)
开发者ID:BTY2684,项目名称:scikit-learn,代码行数:15,代码来源:test_robust_covariance.py


示例16: test_outlier_detection

def test_outlier_detection():
    rnd = np.random.RandomState(0)
    X = rnd.randn(100, 10)
    clf = EllipticEnvelope(contamination=0.1)
    clf.fit(X)
    y_pred = clf.predict(X)

    assert_array_almost_equal(
        clf.decision_function(X, raw_values=True), clf.mahalanobis(X))
    assert_array_almost_equal(clf.mahalanobis(X), clf.dist_)
    assert_almost_equal(clf.score(X, np.ones(100)),
                        (100 - y_pred[y_pred == -1].size) / 100.)
开发者ID:emanuele,项目名称:scikit-learn,代码行数:12,代码来源:test_robust_covariance.py


示例17: labelValidSkeletons_old

def labelValidSkeletons_old(skeletons_file, good_skel_row, fit_contamination = 0.05):
    base_name = getBaseName(skeletons_file)
    progress_timer = timeCounterStr('');
    
    print_flush(base_name + ' Filter Skeletons: Starting...')
    with pd.HDFStore(skeletons_file, 'r') as table_fid:
        trajectories_data = table_fid['/trajectories_data']

    trajectories_data['is_good_skel'] = trajectories_data['has_skeleton']
    
    if good_skel_row.size > 0:
        #nothing to do if there are not valid skeletons left. 
        
        print_flush(base_name + ' Filter Skeletons: Reading features for outlier identification.')
        #calculate classifier for the outliers    
        
        nodes4fit = ['/skeleton_length', '/contour_area'] + \
        ['/' + name_width_fun(part) for part in worm_partitions]
        
        X4fit = nodes2Array(skeletons_file, nodes4fit, good_skel_row)
        assert not np.any(np.isnan(X4fit))
        
        #%%
        print_flush(base_name + ' Filter Skeletons: Fitting elliptic envelope. Total time:' + progress_timer.getTimeStr())
        #TODO here the is a problem with singular covariance matrices that i need to figure out how to solve
        clf = EllipticEnvelope(contamination = fit_contamination)
        clf.fit(X4fit)
        
        print_flush(base_name + ' Filter Skeletons: Calculating outliers. Total time:' + progress_timer.getTimeStr())
        #calculate outliers using the fitted classifier
        X = nodes2Array(skeletons_file, nodes4fit) #use all the indexes
        y_pred = clf.decision_function(X).ravel() #less than zero would be an outlier

        print_flush(base_name + ' Filter Skeletons: Labeling valid skeletons. Total time:' + progress_timer.getTimeStr())
        #labeled rows of valid individual skeletons as GOOD_SKE
        trajectories_data['is_good_skel'] = (y_pred>0).astype(np.int)
    
    #Save the new is_good_skel column
    saveModifiedTrajData(skeletons_file, trajectories_data)

    print_flush(base_name + ' Filter Skeletons: Finished. Total time:' + progress_timer.getTimeStr())
开发者ID:KezhiLi,项目名称:Multiworm_Tracking,代码行数:41,代码来源:getFilteredSkels.py


示例18: test_elliptic_envelope

def test_elliptic_envelope():
    rnd = np.random.RandomState(0)
    X = rnd.randn(100, 10)
    clf = EllipticEnvelope(contamination=0.1)
    assert_raises(NotFittedError, clf.predict, X)
    assert_raises(NotFittedError, clf.decision_function, X)
    clf.fit(X)
    y_pred = clf.predict(X)
    scores = clf.score_samples(X)
    decisions = clf.decision_function(X)

    assert_array_almost_equal(
        scores, -clf.mahalanobis(X))
    assert_array_almost_equal(clf.mahalanobis(X), clf.dist_)
    assert_almost_equal(clf.score(X, np.ones(100)),
                        (100 - y_pred[y_pred == -1].size) / 100.)
    assert(sum(y_pred == -1) == sum(decisions < 0))
开发者ID:AlexisMignon,项目名称:scikit-learn,代码行数:17,代码来源:test_elliptic_envelope.py


示例19: detect_outliers

def detect_outliers(X, station):
    if station=='hoerning':
            outlierfraction = 0.0015
            classifier = svm.OneClassSVM(nu=0.95*outlierfraction + 0.05,
                                         kernel='rbf', gamma=0.1)
            Xscaler = StandardScaler(copy=True, with_mean=True, with_std=True).fit(X)
            X_scaled = Xscaler.transform(X)
            classifier.fit(X_scaled)
            svcpred = classifier.decision_function(X_scaled).ravel()
            threshold = stats.scoreatpercentile(svcpred, 100*outlierfraction)
            inlierpred = svcpred>threshold        
            
    else:
        outlierfraction = 0.0015
        classifier = EllipticEnvelope(contamination=outlierfraction)
        classifier.fit(X)
        gausspred = classifier.decision_function(X).ravel()
        threshold = stats.scoreatpercentile(gausspred, 100*outlierfraction)
        inlierpred = gausspred>threshold
            
    return inlierpred
开发者ID:magndahl,项目名称:dmi_ensemble_handler,代码行数:21,代码来源:heat_exchanger_model.py


示例20: transform

def transform( features, labels ):

#    for ff, ll in zip(features, labels):
#        print ll, ff
#    for rr in range(0, len(features) ):
#        features[rr] = scaler.fit_transform( features[rr] )

    print "transforming features via pca"
    pca = PCA(n_components = 30)
    features = pca.fit_transform( features )

    envelope = EllipticEnvelope()
    envelope.fit( features )
    print envelope.predict( features )

    scaler = MinMaxScaler()
    features = scaler.fit_transform( features )



    return features, labels
开发者ID:cmmalone,项目名称:um_detector,代码行数:21,代码来源:reader.py



注:本文中的sklearn.covariance.EllipticEnvelope类示例由纯净天空整理自Github/MSDocs等源码及文档管理平台,相关代码片段筛选自各路编程大神贡献的开源项目,源码版权归原作者所有,传播和使用请参考对应项目的License;未经允许,请勿转载。


鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
上一篇:
Python covariance.EmpiricalCovariance类代码示例发布时间:2022-05-27
下一篇:
Python covariance.empirical_covariance函数代码示例发布时间:2022-05-27
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap