Logistic regression (data preprocessing question)

Question

661 views

logistic-regression

machine-learning

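Hello! I'm learning logistic regression right now, and I can't figure out how the data should be preprocessed before building the model, so I'm posting here.

I've written the code below for now. Is there a way to improve its accuracy? This is my first time doing this, so I have a rough idea of what I'm doing but don't understand it precisely.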

Below is the problem I'm working on:

An automated answer-rating site marks each post in a community forum website as “good” or “bad” based on the quality of the post. The CSV file, which you can download from OA 9.14, contains the various types of quality as measured by the tool. Following are the types of qualities that the dataset contains:

i. num_words: number of words in the post
ii. num_characters: number of characters in the post
iii. num_misspelled: number of misspelled words
iv. bin_end_qmark: if the post ends with a question mark
v. num_interrogative: number of interrogative words in the post
vi. bin_start_small: if the answer starts with a lowercase letter (“1” means yes, otherwise no)
vii. num_sentences: number of sentences per post
viii. num_punctuations: number of punctuation symbols in the post
ix. label: the label of the post (“G” for good and “B” for bad) as determined by the tool

Create a logistic regression model to predict the class label from the first eight attributes of the question set. Evaluate the accuracy of your model.

This is the quality.csv file: https://github.com/nam14d/imt574_conglomorate/blob/806fd329af1672e08827367ba044263703bcee49/Assignment3_wine_quality/quality.csv

And this is the code I've written so far:

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

qual = pd.read_csv("./quality.csv")

# Encode the label: "B" (bad) -> 0, "G" (good) -> 1
qual['label'] = np.where(qual['label'] == 'B', 0, 1)

# Features: the first eight attributes (drop the serial number and the label)
X = qual.drop(['S.No.', 'label'], axis=1)
# Min-max scale each column to [0, 1]: (x - min) / (max - min)
X_normalized = X.apply(lambda x: (x - x.min()) / (x.max() - x.min()))
y = qual['label']

# Hold out 20% of the scaled data for testing
Xtrain, Xtest, ytrain, ytest = train_test_split(X_normalized, y, test_size=0.2)

logmod = LogisticRegression()
logmod.fit(Xtrain, ytrain)

predictions = logmod.predict(Xtest)

# Fraction of test posts whose label was predicted correctly
print(accuracy_score(ytest, predictions))
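
One common way to handle the scaling step is to put the scaler and the classifier into a single scikit-learn pipeline, so that the scaler's min and max are learned from the training split only. What follows is a minimal sketch under assumptions, not a measured result on quality.csv: the data is a synthetic stand-in that borrows three column names from the problem, the labeling rule is made up so the toy labels are learnable, and `demo`, `model`, and `test_accuracy` are illustrative names.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for quality.csv (assumption: the real file is not loaded here).
rng = np.random.default_rng(0)
n = 400
demo = pd.DataFrame({
    "num_words": rng.integers(5, 300, n),
    "num_misspelled": rng.integers(0, 20, n),
    "bin_end_qmark": rng.integers(0, 2, n),
})
# Hypothetical rule, for illustration only: many misspellings -> "B".
demo["label"] = np.where(demo["num_misspelled"] > 10, "B", "G")

X = demo.drop(columns="label")
y = (demo["label"] == "G").astype(int)  # "G" -> 1, "B" -> 0

# stratify keeps the G/B ratio the same in both splits;
# random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# fit() learns the scaling from X_train only; predict() reuses it on X_test.
model = make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")
```

To try this on the real data, `demo` would be replaced by the frame read from quality.csv, with the serial-number column dropped as in the code above. Fixing `random_state` also makes accuracy numbers comparable between runs, which helps when checking whether a preprocessing change actually made a difference.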

감자 0 points

Posted 2023-12-08 09:09:25

Answer 1

rahon6000@gmail.com 55 points

Posted 2023-12-10 13:45:22
