Overview

This page implements multiple regression analysis with gradient descent (the steepest descent method) from scratch, relying on libraries as little as possible.

Model equation

The model equation for simple regression is $\hat{y} = w_1 x + w_0$,
whereas multiple regression, which has several features, uses $\hat{y} = w_1 x_1 + w_2 x_2 + w_0 x_0$ (with $x_0 = 1$ so that the model can be written as a matrix product).

Suppose we have the following data:

x1   x2   y
70   68   87
75   82   86
80   71   93

The first row gives $\hat{y} = w_1 \cdot 70 + w_2 \cdot 68 + w_0 \cdot 1$,
and $\hat{y}$ for every row can be computed with the following matrix product.
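$$
\begin{pmatrix} 70 & 68 & 1 \\ 75 & 82 & 1 \\ 80 & 71 & 1 \end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ w_0 \end{pmatrix}
=
\begin{pmatrix} 70 w_1 + 68 w_2 + 1 w_0 \\ 75 w_1 + 82 w_2 + 1 w_0 \\ 80 w_1 + 71 w_2 + 1 w_0 \end{pmatrix}
$$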
Therefore, writing the feature matrix as X and the parameter matrix as W, the model can be expressed as follows.
$$\hat{y} = XW$$
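
As a small illustration of the matrix form, the three rows from the table can be multiplied by a parameter vector with NumPy. The values in `W` are hypothetical placeholders chosen only to make the arithmetic visible; they are not fitted parameters.

```python
import numpy as np

# Feature matrix from the table, with the constant column x0 = 1 placed last
X = np.array([[70, 68, 1],
              [75, 82, 1],
              [80, 71, 1]], dtype=float)

# Hypothetical parameters in the order (w1, w2, w0); NOT fitted values
W = np.array([[0.5], [0.25], [10.0]])

# One matrix product yields the prediction for every row at once
y_hat = X @ W
print(y_hat.ravel())
```

Each entry of `y_hat` equals `w1*x1 + w2*x2 + w0*1` for the corresponding row, which is exactly the row-by-row expansion of the model equation.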

Cost function

The cost function is as follows.

$$J(W) = \frac{1}{2m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2 = \frac{1}{2m} (XW - y)^T (XW - y)$$
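
The two forms of the cost (the sum over samples and the matrix product) should give the same number. The following sketch checks this on the three-row example data. The function names `cost_sum` and `cost_matrix` and the values in `W` are introduced here for illustration only.

```python
import numpy as np

X = np.array([[70, 68, 1],
              [75, 82, 1],
              [80, 71, 1]], dtype=float)
y = np.array([87, 86, 93], dtype=float)

def cost_sum(X, y, W):
    """Cost as an explicit sum over the m samples."""
    m = len(y)
    total = 0.0
    for i in range(m):
        y_hat_i = X[i] @ W              # prediction for sample i
        total += (y_hat_i - y[i]) ** 2  # squared error for sample i
    return total / (2 * m)

def cost_matrix(X, y, W):
    """Same cost written as (XW - y)^T (XW - y) / (2m)."""
    m = len(y)
    residual = X @ W - y
    return (residual @ residual) / (2 * m)

# Hypothetical parameters (w1, w2, w0), used only for the comparison
W = np.array([0.5, 0.25, 10.0])
print(cost_sum(X, y, W), cost_matrix(X, y, W))
```

The matrix form avoids the Python loop, which is why the vectorized version is the one normally used in an implementation.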

Gradient descent

The vectorized update rule for gradient descent is as follows.

$$W := W - \alpha \frac{1}{m} X^T (XW - y)$$
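
The term $\frac{1}{m} X^T (XW - y)$ in the update is the gradient of the cost with respect to W. A minimal sketch, assuming the three-row example data, compares it with a finite-difference approximation and applies a single update step. The helper names and the learning rate `alpha = 1e-5` are assumptions made for this illustration (the features here are not normalized, so a very small rate is used).

```python
import numpy as np

X = np.array([[70, 68, 1],
              [75, 82, 1],
              [80, 71, 1]], dtype=float)
y = np.array([87, 86, 93], dtype=float)

def cost(X, y, W):
    r = X @ W - y
    return (r @ r) / (2 * len(y))

def gradient(X, y, W):
    """Analytic gradient: (1/m) X^T (XW - y)."""
    return X.T @ (X @ W - y) / len(y)

def numerical_gradient(X, y, W, eps=1e-4):
    """Central-difference approximation of the gradient, one parameter at a time."""
    grad = np.zeros_like(W)
    for j in range(W.size):
        step = np.zeros_like(W)
        step[j] = eps
        grad[j] = (cost(X, y, W + step) - cost(X, y, W - step)) / (2 * eps)
    return grad

W = np.zeros(3)
alpha = 1e-5  # hypothetical learning rate for unnormalized features

# One update step: W := W - alpha * (1/m) X^T (XW - y)
W_next = W - alpha * gradient(X, y, W)
print(cost(X, y, W), cost(X, y, W_next))
```

If the analytic and numerical gradients agree and the cost decreases after the step, the update rule has been implemented consistently with the cost function.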

Normalization

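When the ranges of the features differ from column to column, the data should be normalized so that the analysis works correctly (gradient descent converges far more reliably when the features share a comparable scale).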

z-score normalization (standardization)

Adjust each feature so that its mean is 0 and its standard deviation is 1.

$$x_1 := \frac{x_1 - x_{mean}}{x_{std}}$$
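
A short sketch of z-score normalization applied column by column to the two example features. It uses `np.std` with its default `ddof=0` (the population standard deviation).

```python
import numpy as np

# The two feature columns (x1, x2) from the example table
X = np.array([[70, 68],
              [75, 82],
              [80, 71]], dtype=float)

mean = X.mean(axis=0)  # per-column mean
std = X.std(axis=0)    # per-column standard deviation (ddof=0)

X_z = (X - mean) / std
print(X_z.mean(axis=0), X_z.std(axis=0))
```

After the transformation each column has mean 0 and standard deviation 1, up to floating-point rounding. New inputs at prediction time need to be scaled with the same `mean` and `std` computed from the training data.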

min-max normalization

Adjust each feature so that its maximum is 1 and its minimum is 0.

$$x_1 := \frac{x_1 - x_{min}}{x_{max} - x_{min}}$$
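
For the maximum to map to 1 and the minimum to 0, the column minimum is subtracted and the result is divided by the column range. A minimal sketch on the two example features:

```python
import numpy as np

# The two feature columns (x1, x2) from the example table
X = np.array([[70, 68],
              [75, 82],
              [80, 71]], dtype=float)

x_min = X.min(axis=0)  # per-column minimum
x_max = X.max(axis=0)  # per-column maximum

X_mm = (X - x_min) / (x_max - x_min)
print(X_mm)
```

Subtracting the mean instead of the minimum would center the data around 0 rather than map it onto the interval [0, 1].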

Sample implementation

Using the data below, we implement a procedure that uses multiple regression to estimate the rent for an arbitrary room area and building age.

Area (m²)   Age (years)   Rent (×10,000 yen)
40.24       7             6.2
72.48       4             9.3
72.48       4             9.3
43.85       0             7.5
88.33       0             13.3

Note: the data file is sample_rent1.csv

sample_rent1.py

"""Multiple regression analysis sample."""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def normalize(X, data=None):
    """Z-score normalization.

    Scales `data` (default: X itself) with the column means and standard
    deviations of X, then prepends the constant column x0 = 1.
    """
    data = X if data is None else data
    m = data.shape[0]
    X_norm = np.zeros((data.shape[0], data.shape[1]))
    for i in range(data.shape[1]):
        X_norm[:, i] = (data[:, i] - float(np.mean(X[:, i]))) / float(np.std(X[:, i]))

    # Add x0 (a column of ones)
    X_norm = np.column_stack([np.ones([m,1]), X_norm])

    return X_norm

def cost(x, y, w):
    """Cost function."""
    m = len(y)  # number of samples (previously relied on a global m)
    xw = x.dot(w)
    return np.dot((xw - y).T, (xw - y)) / (2*m)
    # Alternatively:
    #diff = np.power((x.dot(w) - y), 2)
    #return diff.sum(axis=0) / (2 * len(y))

def gradient_descent(x, y, w, alpha, iter_num):
    """Gradient descent (steepest descent)."""
    m = len(y)
    costs = np.zeros((iter_num, 1))
    for i in range(iter_num):
        w = w - alpha * (1.0/m) * np.transpose(x).dot(x.dot(w) - y)
        costs[i] = cost(x, y, w)
    return w, costs

if __name__ == "__main__":

    # --------------------------
    # Multiple regression analysis by gradient descent
    # --------------------------

    # Load the data
    data = np.loadtxt("data/sample_rent1.csv", delimiter=",", skiprows=1)
    x = data[:, 1:3]
    y = data[:, 3:4]

    # Number of samples
    m = len(y)

    # Normalize and add x0
    X_norm = normalize(x)

    # Initial parameter values
    w_int = np.zeros((3, 1))

    # Learning rate
    alpha = 0.01

    # Number of iterations
    iter_num = 1000

    # Run the analysis with gradient descent
    w, costs = gradient_descent(X_norm, y, w_int, alpha, iter_num)

    # --------------------------
    # Use the fitted model to predict other data
    # --------------------------
    z = np.array([[60, 10], [50, 10], [40, 10]])
    # Scale the new inputs with the mean and standard deviation of the training data x
    result = normalize(x, z).dot(w)
    for i in range(z.shape[0]):
        print("Size: {} m², Age: {} years ⇒ {:0.1f} × 10,000 yen".format(z[i,0], z[i,1], result[i,0]))

    # --------------------------
    # Plot the results
    # --------------------------
    fig = plt.figure(figsize=(10, 5))

    # 3D scatter plot
    ax = fig.add_subplot(1, 2, 1, projection='3d')
    ax.scatter(data[:, 1], data[:, 2], data[:, 3], color="#ef1234")
    ax.set_xlabel("Size")
    ax.set_ylabel("Age")
    ax.set_zlabel("Rent")

    # Cost at each iteration
    ax2 = fig.add_subplot(1, 2, 2)
    ax2.plot(range(costs.size), costs[:, 0], "r")
    ax2.set_xlabel("iterations")
    ax2.set_ylabel("cost")
    ax2.grid(True)
    plt.show()

Result

Size: 60 m², Age: 10 years ⇒ 8.1 × 10,000 yen
Size: 50 m², Age: 10 years ⇒ 7.2 × 10,000 yen
Size: 40 m², Age: 10 years ⇒ 6.3 × 10,000 yen

Figure: sample_rent1.png (left: 3D scatter plot of Size, Age and Rent; right: cost per iteration)
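
One way to sanity-check a gradient-descent fit is to compare its parameters with a closed-form least-squares solution. The sketch below does this on synthetic stand-in data, not on the contents of sample_rent1.csv. The coefficients used to generate the data, the learning rate, and the iteration count are all assumptions made for the illustration, and `np.linalg.lstsq` is used only as a reference solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the contents of sample_rent1.csv)
m = 50
area = rng.uniform(20, 100, size=m)
age = rng.uniform(0, 40, size=m)
rent = 0.1 * area - 0.05 * age + 3.0 + rng.normal(0, 0.3, size=m)

# Z-score normalization, then prepend the constant column x0 = 1
features = np.column_stack([area, age])
features_z = (features - features.mean(axis=0)) / features.std(axis=0)
X = np.column_stack([np.ones(m), features_z])
y = rent.reshape(-1, 1)

# Gradient descent with the vectorized update rule
w = np.zeros((3, 1))
alpha, iter_num = 0.1, 2000  # hypothetical settings for this synthetic data
for _ in range(iter_num):
    w = w - alpha * (1.0 / m) * X.T @ (X @ w - y)

# Reference: closed-form least-squares solution
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.max(np.abs(w - w_exact)))
```

When the learning rate and iteration count are adequate, the largest difference between the two parameter vectors should be close to zero. A noticeably larger gap suggests that gradient descent has not yet converged.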


Last-modified: 2019-11-22 (Fri) 07:49:12