Deep Learning Practice

데이터 한 그릇

장사이언스 2022. 1. 5. 14:36

After a very brief EDA, we will build a deep learning model.

Pima Indians Diabetes Prediction

EDA

import pandas as pd

# The CSV has no header row, so the column names are supplied explicitly.
pima_df = pd.read_csv('C:\\Users\\user\\Desktop\\080228\\deeplearning\\dataset\\pima-indians-diabetes.csv',
                      names=['pregnant', 'plasma', 'pressure', 'thickness', 'insulin', 'BMI', 'pedigree', 'age', 'class'],
                      header=None)

# Check column dtypes and non-null counts.
pima_df.info()

import seaborn as sns
import matplotlib.pyplot as plt

# Heatmap of pairwise correlations between all columns, including the target 'class'.
plt.figure(figsize=(15, 15))
sns.heatmap(pima_df.corr(), linewidths=0.1, vmax=0.5, cmap=plt.cm.gist_heat, linecolor='white', annot=True)
plt.show()
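
Each cell of the heatmap is a correlation coefficient between two columns; `DataFrame.corr()` uses the Pearson coefficient by default. As a rough illustration of what that number means, the sketch below computes it by hand in plain Python. The function name `pearson` and the sample values are made up for this example and are not taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, NOT rows from pima-indians-diabetes.csv.
plasma = [85, 89, 137, 168, 183]
label = [0, 0, 1, 1, 1]

r = pearson(plasma, label)
print(round(r, 3))
```

A value near 1 means the two columns rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.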

Building and Training the Model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Features are the first 8 columns; the target is the last column ('class').
X = pima_df.iloc[:, 0:8]
y = pima_df.iloc[:, 8]

# 8 inputs -> 12 -> 8 -> 1 sigmoid output (probability of the positive class)
model = Sequential()
model.add(Dense(12, input_dim=pima_df.shape[1] - 1, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X, y, epochs=200, batch_size=10)

# Note: this evaluates on the same data used for training.
print('Accuracy : {:.4f}'.format(model.evaluate(X, y)[1]))

Model accuracy (on the training data): 0.7773
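
The output layer above uses a sigmoid, so the model emits a probability between 0 and 1, and `binary_crossentropy` scores that probability against the 0/1 label. The following is a minimal plain-Python sketch of those two pieces plus a 0.5-threshold accuracy. The function names, the clipping constant `eps`, and the numbers are assumptions made for illustration; this is not the Keras implementation.

```python
import math

def sigmoid(z):
    """Squash a real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps (assumed) keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == (y == 1))
    return hits / len(y_true)

# Made-up pre-activation values and labels, not model output.
logits = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 0, 0]

probs = [sigmoid(z) for z in logits]
loss = binary_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(loss, acc)
```

In this toy example the third sample is predicted as positive although its label is 0, so three of the four predictions are correct.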

Iris CSV Multi-class Classification Exercise

EDA

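import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('C:\\Users\\user\\Desktop\\080228\\deeplearning\\dataset\\iris.csv',
                 names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'])

# Scatter plots for every pair of features, colored by species.
sns.pairplot(df, hue='species')
plt.show()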

Building and Training the Model

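from sklearn.preprocessing import LabelEncoder
import tensorflow as tf

dataset = df.values
X = dataset[:, 0:4].astype(float)
y_obj = dataset[:, 4]

# Convert the species names (strings) to integer class indices.
e = LabelEncoder()
e.fit(y_obj)
Y = e.transform(y_obj)

# categorical_crossentropy compares the softmax output with a target vector of 0s and 1s,
# so each integer label is converted to such a vector. This is called one-hot encoding.
y_encoded = tf.keras.utils.to_categorical(Y)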
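
What `LabelEncoder` followed by `to_categorical` produces can be sketched in plain Python. The helper names `label_encode` and `one_hot` are hypothetical, and the species strings are assumed example values; `LabelEncoder` assigns integer codes in sorted order of the class names, which `sorted` mirrors here.

```python
def label_encode(labels):
    """Map each distinct label to an integer, in sorted order of the labels."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes

def one_hot(codes, num_classes):
    """Turn integer codes into vectors with a single 1.0 at the code's position."""
    return [[1.0 if i == code else 0.0 for i in range(num_classes)]
            for code in codes]

# Assumed example labels for illustration.
species = ['Iris-setosa', 'Iris-virginica', 'Iris-versicolor', 'Iris-setosa']

codes, classes = label_encode(species)
encoded = one_hot(codes, len(classes))
print(codes)    # [0, 2, 1, 0]
print(encoded)  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
```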

# Build and train the model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(16, input_dim=4, activation='relu'))
# The prediction must be one of 3 classes, so the output layer has 3 nodes with softmax
# (softmax is the usual choice when there are 3 or more classes).
model.add(Dense(3, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X, y_encoded, epochs=50, batch_size=1)

# Note: this evaluates on the same data used for training.
print('Accuracy : {:.4f}'.format(model.evaluate(X, y_encoded)[1]))

Accuracy : 0.9800

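Notes:

To match the softmax output, the labels must be converted to vectors of 0s and 1s (one-hot encoding); the string labels cannot be used directly.
TensorFlow provides a function for one-hot encoding (scikit-learn has one as well).
Because this is a multi-class classification problem, the output layer has 3 nodes, and its activation function is softmax rather than sigmoid.
The loss function is categorical_crossentropy, which is suited to multi-class classification. binary_crossentropy is used for binary classification.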
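
The last two notes can be made concrete with a small plain-Python sketch of softmax and categorical cross-entropy for a single sample. The function names, the clipping constant `eps`, and the score values are assumptions made for illustration; this is not the Keras implementation.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow.
    shifted = [s - max(scores) for s in scores]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probs, eps=1e-7):
    """Cross-entropy for one sample; only the true class's term is non-zero."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(one_hot_target, probs))

# Made-up output-layer scores for one flower and its one-hot label (class 0).
scores = [2.0, 1.0, 0.1]
target = [1.0, 0.0, 0.0]

probs = softmax(scores)
loss = categorical_crossentropy(target, probs)
predicted_class = probs.index(max(probs))
print(probs, loss, predicted_class)
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so it shrinks toward 0 as that probability approaches 1.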
