If you are totally new to TensorFlow, like me, this may help you.
I tried to add some basic details to this tutorial.
This tutorial is from the TF2.0 official website.
The sample is about what to do when you have a CSV with a specific data structure: how can you handle your data?
The dataset is an online dataset provided by the TF2.0 tutorial.
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import pandas as pd
import tensorflow as tf
print(tf.__version__)
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
2.0.0-alpha0
URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
data = pd.read_csv(URL)
data.head()
train, test = train_test_split(data, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), ' ', len(val), ' ', len(test))
193 49 61
Here the sample introduces an API called tf.data,
which allows you to build complex input pipelines.
There are two ways to create this kind of dataset:
- Dataset.from_tensor_slices()
this method is a data source: it constructs a dataset from in-memory tf.Tensor objects.
- Dataset.batch()
this method is a transformation: it constructs a new dataset from one or more existing Dataset objects.
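A minimal sketch of those two ways, using a toy NumPy array in place of the heart.csv columns (the values here are made up for illustration):

```python
import numpy as np
import tensorflow as tf

# Hypothetical in-memory data standing in for a CSV column.
features = np.arange(10, dtype=np.float32)

# Data source: build a Dataset directly from in-memory tensors.
ds = tf.data.Dataset.from_tensor_slices(features)

# Transformation: batch() wraps the existing Dataset in a new Dataset
# that yields groups of 4 elements at a time.
batched = ds.batch(4)

for batch in batched:
    print(batch.numpy())
```

The last batch holds only the 2 leftover elements, since batch() does not drop remainders by default.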
To extract elements from a dataset in TF2's eager mode, you can simply iterate over it with a Python for loop,
or call iter() on it and pull elements with next();
Iterator.get_next() is the corresponding TF1-style API.
You can find details on the official website.
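Both iteration styles above can be sketched like this (the element values are arbitrary examples):

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices([10, 20, 30])

# TF2 eager style: a Dataset is a plain Python iterable.
for element in ds:
    print(element.numpy())  # prints 10, then 20, then 30

# Equivalent: grab an iterator explicitly and ask for the next element,
# the TF2 counterpart of the TF1-style Iterator.get_next().
it = iter(ds)
first = next(it)
print(first.numpy())  # prints 10
```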
The convenient thing about tf.data is that it helps you transform and normalize your data.
For example, in a classification task you won't feed a categorical id to the model directly;
you will one-hot encode those items instead.
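A sketch of that one-hot idea with the feature_column API imported earlier. It assumes a categorical string column named 'thal' with the vocabulary shown; adapt the name and vocabulary to your own CSV:

```python
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers

# Hypothetical batch of one categorical column (assumed values).
batch = {'thal': tf.constant([['fixed'], ['normal'], ['reversible']])}

# Describe the raw strings as a categorical column with a fixed vocabulary...
thal = feature_column.categorical_column_with_vocabulary_list(
    'thal', ['fixed', 'normal', 'reversible'])

# ...then wrap it so each category becomes a one-hot vector instead of an id.
thal_one_hot = feature_column.indicator_column(thal)

# DenseFeatures turns the column definition into actual input tensors.
dense = layers.DenseFeatures(thal_one_hot)
print(dense(batch).numpy())
```

Each row of the output is a one-hot vector, so the model never sees the raw category ids.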