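Models: decision_tree, random_forest, svm_model, sgd_model, logistic_model
This is supervised learning: the models are trained on labeled data (data that comes with an answer key) and then used to make predictions.
Comparing the predictions against the answer key gives the accuracy.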
1. Data preparation
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import pandas as pd
# Load the data
bc = load_breast_cancer()
bc_data = bc.data
bc_label = bc.target
# Organize the data into a DataFrame
bc_df = pd.DataFrame(data=bc_data, columns=bc.feature_names)
bc_df["label"] = bc.target
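Before splitting, it can help to take a quick look at what was loaded. The following is a minimal sketch, assuming the same `bc_df` layout as above. The names `n_rows`, `n_cols`, and `label_counts` are my own and used only for illustration.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the same DataFrame as above so this snippet runs on its own
bc = load_breast_cancer()
bc_df = pd.DataFrame(data=bc.data, columns=bc.feature_names)
bc_df["label"] = bc.target

# Number of samples and columns (30 feature columns + 1 label column)
n_rows, n_cols = bc_df.shape
print(n_rows, n_cols)

# How many samples belong to each class (0 = malignant, 1 = benign)
label_counts = bc_df["label"].value_counts()
print(label_counts)

# Class names, in the order of the label values
print(list(bc.target_names))
```

The classes are not evenly balanced, which is one reason to look at the per-class numbers in the classification report in addition to plain accuracy.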
2. Model preparation and training
# training dataset, test dataset
X_train, X_test, y_train, y_test = train_test_split(bc_data,
                                                    bc_label,
                                                    test_size=0.2,
                                                    random_state=7)
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn import svm
from sklearn.linear_model import SGDClassifier
from sklearn.linear_model import LogisticRegression
# Prepare the models
decision_tree = DecisionTreeClassifier(random_state=32)
random_forest = RandomForestClassifier(random_state=32)
svm_model = svm.SVC()
sgd_model = SGDClassifier()
logistic_model = LogisticRegression()
model_list = [decision_tree, random_forest, svm_model, sgd_model, logistic_model]
# Train each model
for model in model_list:
    model.fit(X_train, y_train)
3. Prediction
y_pred_list = []
for model in model_list:
    y_pred = model.predict(X_test)
    y_pred_list.append(y_pred)
4. Accuracy
from sklearn.metrics import accuracy_score
acc_list = []
for y_pred in y_pred_list:
    accuracy = accuracy_score(y_test, y_pred)
    acc_list.append(accuracy)
class_report_list = []
for y_pred in y_pred_list:
    class_report_list.append(classification_report(y_test, y_pred))
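The lists above hold the results in the same order as `model_list`, but nothing prints them yet. One way to view the scores next to the model names is sketched below. The `models` dictionary, its string keys, and `acc_by_model` are my own naming for illustration. The models and the split use the same settings as the code above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn import svm
from sklearn.linear_model import SGDClassifier, LogisticRegression

# Same data and split as above
bc = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    bc.data, bc.target, test_size=0.2, random_state=7)

# Same models as above, keyed by name (the keys are illustrative)
models = {
    "decision_tree": DecisionTreeClassifier(random_state=32),
    "random_forest": RandomForestClassifier(random_state=32),
    "svm_model": svm.SVC(),
    "sgd_model": SGDClassifier(),
    "logistic_model": LogisticRegression(),
}

acc_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    acc_by_model[name] = accuracy_score(y_test, y_pred)
    # Per-class precision, recall, and F1 for this model
    print(f"=== {name} ===")
    print(classification_report(y_test, y_pred, target_names=bc.target_names))

# Accuracy per model, highest first
for name, acc in sorted(acc_by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

Because `SGDClassifier()` is created without a `random_state`, its score may differ from run to run. With default settings on these unscaled features, `LogisticRegression()` may also emit a convergence warning, which does not stop the script.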