Open-source project name (OpenSource Name): Olow304/Data-Science-Machine-Learning
Open-source project URL (OpenSource Url): https://github.com/Olow304/Data-Science-Machine-Learning
Programming language (OpenSource Language): Jupyter Notebook 92.8%
Introduction (OpenSource Introduction):
Complete-Data-Science-Toolkits
The goal of this toolkit is to offer a free collection of data analysis and machine learning samples suited to data science work, so that you can get started in a matter of minutes. You can run the collection either in a Jupyter notebook or with plain Python.
Features
Machine Learning
Numpy
Pandas
Visualization
Naming Conventions
Code Samples:
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
model = SVC(kernel='linear', C=1)
# evaluate the model with 5-fold cross-validation
scores = cross_val_score(model, X, y, cv=5)
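The cross-validation sample leaves X and y undefined. A minimal runnable sketch follows, assuming the bundled iris dataset as a stand-in for the data (that choice is an assumption, not part of the toolkit):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Assumption: iris stands in for the X and y used in the sample
X, y = load_iris(return_X_y=True)

model = SVC(kernel="linear", C=1)
# cv=5 splits the data into 5 folds and returns one accuracy score per fold
scores = cross_val_score(model, X, y, cv=5)

print(scores)
print(f"mean accuracy: {scores.mean():.3f}")
```

Reporting the mean together with the per-fold scores gives a rough sense of how stable the model is across splits.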
import numpy as np
from sklearn.model_selection import GridSearchCV
# knn is the KNeighborsClassifier from the supervised learning sample
params = {"n_neighbors": np.arange(1,5), "metric": ["euclidean", "cityblock"]}
grid = GridSearchCV(estimator=knn, param_grid=params)
grid.fit(X_train, y_train)
print(grid.best_score_)
print(grid.best_estimator_.n_neighbors)
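The grid search sample depends on knn, X_train and y_train being defined elsewhere. The sketch below fills those in under assumptions: the iris dataset, a train/test split with random_state=0, and cv=5 are illustrative choices, not values from the toolkit.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris and this split stand in for the toolkit's training data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier()
params = {"n_neighbors": np.arange(1, 5), "metric": ["euclidean", "cityblock"]}

# Every combination in params (4 x 2 = 8) is scored with 5-fold cross-validation
grid = GridSearchCV(estimator=knn, param_grid=params, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)
print(grid.best_score_)
# The refitted best estimator can be scored on the held-out data
print(grid.score(X_test, y_test))
```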
from sklearn.impute import SimpleImputer
# treat 0 as missing and replace it with the column mean
impute = SimpleImputer(missing_values=0, strategy='mean')
impute.fit_transform(X_train)
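In current scikit-learn releases, mean imputation is provided by SimpleImputer in sklearn.impute, which always works column by column. A small sketch on a made-up array (the values are illustrative only) shows the effect of treating 0 as missing:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data: zeros mark the missing entries
X_train = np.array([[1.0, 2.0],
                    [0.0, 4.0],
                    [5.0, 0.0]])

impute = SimpleImputer(missing_values=0, strategy="mean")
# Each 0 is replaced by the mean of the non-missing values in its column
X_filled = impute.fit_transform(X_train)
print(X_filled)
```

Both columns have a mean of 3.0 over their non-zero entries, so each zero becomes 3.0.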
from sklearn.model_selection import RandomizedSearchCV
params = {"n_neighbors" : range(1,5), "weights": ["uniform", "distance"]}
rsearch = RandomizedSearchCV(estimator=knn, param_distributions=params, cv=4, n_iter=8, random_state=5)
rsearch.fit(X_train, y_train)
print(rsearch.best_score_)
#supervised learning
from sklearn import neighbors
knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
#unsupervised learning
from sklearn.decomposition import PCA
# keep enough components to explain 95% of the variance
pca = PCA(n_components=0.95)
X_train_reduced = pca.fit_transform(X_train)
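When n_components is a float between 0 and 1, PCA keeps the smallest number of components whose cumulative explained variance reaches that fraction. A minimal sketch, assuming the iris dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Assumption: iris stands in for X_train
X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=0.95)
# fit_transform returns the projected data, not the fitted model
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())
```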
import numpy as np
#appends values to end of arr
np.append(arr, values)
#inserts values into arr before index 2
np.insert(arr, 2, values)
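Both np.append and np.insert return a new array and leave the original untouched. A short sketch with made-up values:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
values = np.array([10, 20])

appended = np.append(arr, values)     # values added at the end
inserted = np.insert(arr, 2, values)  # values placed before index 2

print(appended)  # [ 1  2  3  4 10 20]
print(inserted)  # [ 1  2 10 20  3  4]
print(arr)       # [1 2 3 4], the original is unchanged
```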
import numpy as np
arr = np.array([1,2,3,4,5,6,7])
#return the element at index 5
arr[5]
#assign array element on index 1 the value 4
arr[1] = 4
arr2d = np.arange(28).reshape(4,7)
#return the 2D array element at row 2, column 5
arr2d[2,5]
#assign the element at row 1, column 3 the value 10
arr2d[1,3] = 10
import pandas as pd
#specify values for each rows and columns
df = pd.DataFrame(
[[4,7,10],
[5,8,11],
[6,9,12]],
index=[1,2,3],
columns=['a','b','c'])
import pandas as pd
#return a groupby object, grouped by values in the column named 'Cities'
df.groupby(by="Cities")
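A groupby object does nothing until an aggregation is applied to it. The sketch below uses a hypothetical DataFrame (the city names and the Sales column are invented for illustration):

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "Cities": ["Oslo", "Lima", "Oslo", "Lima", "Rome"],
    "Sales": [10, 20, 30, 40, 50],
})

grouped = df.groupby(by="Cities")
# Sum the Sales column within each city; the result is indexed by city
totals = grouped["Sales"].sum()
print(totals)
```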
import pandas as pd
#drop rows with any column having NA/null data.
df.dropna()
#replace all NA/null data with value
df.fillna(value)
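Neither dropna nor fillna modifies the DataFrame in place by default; each returns a new one. A small sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing entries
df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [4.0, 5.0, np.nan]})

dropped = df.dropna()  # keeps only rows with no missing values
filled = df.fillna(0)  # replaces every missing value with 0

print(dropped)
print(filled)
```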
import pandas as pd
#most pandas methods return a DataFrame, so another method
#can be chained onto the result; this improves readability of code
df = (pd.melt(df)
.rename(columns={'variable':'var', 'value':'val'})
.query('val >= 200')
)
import matplotlib.pyplot as plt
#saves plot/figure to image
plt.savefig('pic_name.png')
import matplotlib.pyplot as plt
#add * for every data point
plt.plot(x,y, marker='*')
#adds dot for every data point
plt.plot(x,y, marker='.')
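import matplotlib.pyplot as plt
#a container that contains all plot elements
fig = plt.figure()
#adds an axes at [left, bottom, width, height], in figure fractions
fig.add_axes([0.1, 0.1, 0.8, 0.8])
#a subplot is an axes on a grid: 222 means 2 rows, 2 columns, position 2
a = fig.add_subplot(222)
#creates a new figure with a 3x2 grid of subplots
fig, b = plt.subplots(nrows=3, ncols=2)
#creates a new figure with a 2x2 grid of subplots
fig, axes = plt.subplots(2,2)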
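plt.subplots returns a (figure, axes) pair, and with more than one row and column the axes come back as a NumPy array. A minimal sketch, assuming the non-interactive Agg backend so it runs without a display (the plotted values and the title are invented):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=3, ncols=2)
# axes is a 3x2 array; index it by [row, column] to pick a panel
axes[0, 0].plot([1, 2, 3], [1, 4, 9], marker="*")
axes[0, 0].set_title("top-left panel")
fig.tight_layout()

print(axes.shape)
```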
import matplotlib.pyplot as plt
#places text at coordinates 1/1
plt.text(1,1, 'Example text', style='italic')
#annotate the point with coordinates xy with text
ax = plt.gca()
ax.annotate('some annotation', xy=(10,10))
#render a math formula in the title
plt.title(r'$\delta_i=20$',fontsize=10)