Python neural networks: writing, reading, and parsing tfrecords files

Author: Bubbliiiing | Updated: 2022-06-28 | Category: Programming languages

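Preface

A while ago I got a rough picture of the overall framework for SSD prediction and training, but many of the details were still unclear to me. Today I decided to take a proper look at how tfrecords files are structured.

What is the tfrecords format

tfrecords is a binary file format specific to tensorflow. Arbitrary data can be converted to tfrecords. It makes better use of memory, is easier to copy and move, and does not need a separate label file.

The reason for using tfrecords is that, with today's explosion of data, ordinary data formats are both cumbersome and slow to read. A format built specifically for tensorflow can speed up data loading considerably, and it keeps all the content in one regular structure, so the data stays simple and clear without sacrificing speed.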
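
To make "binary file format" concrete, here is a minimal sketch of how a TFRecord file frames its records on disk, using only the standard library. It assumes the commonly documented layout: an 8-byte little-endian length, a 4-byte masked CRC-32C of that length, the payload, and a 4-byte masked CRC-32C of the payload. The function names (write_record, read_records, masked_crc) are made up for this illustration and are not tensorflow APIs.

```python
import io
import struct


def _make_crc32c_table():
    """Build the lookup table for CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    table = []
    for n in range(256):
        crc = n
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc32c_table()


def crc32c(data):
    """Plain CRC-32C of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """CRC-32C rotated right by 15 bits plus a constant, kept within 32 bits."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload):
    """Append one framed record (length, length CRC, payload, payload CRC)."""
    header = struct.pack('<Q', len(payload))
    stream.write(header)
    stream.write(struct.pack('<I', masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack('<I', masked_crc(payload)))


def read_records(stream):
    """Yield payloads one by one, checking both checksums."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        (header_crc,) = struct.unpack('<I', stream.read(4))
        if header_crc != masked_crc(header):
            raise ValueError('corrupted length field')
        (length,) = struct.unpack('<Q', header)
        payload = stream.read(length)
        (payload_crc,) = struct.unpack('<I', stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError('corrupted payload')
        yield payload


# An in-memory buffer stands in for a .tfrecords file
buffer = io.BytesIO()
for payload in (b'first example', b'second'):
    write_record(buffer, payload)
buffer.seek(0)
records = list(read_records(buffer))
print(records)
```

In a real file each payload would be a serialized tf.train.Example rather than a short byte string. Each record costs 16 bytes of framing on top of its payload, which is why a single file of many small records stays compact and can be read sequentially.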

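Writing tfrecords

This example shows how to write the MNIST dataset to tfrecords. The MNIST dataset is imported through the loader that ships with tensorflow (all code in this article uses the tensorflow 1.x API).

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the MNIST dataset
mnist = input_data.read_data_sets('./MNIST_data', dtype=tf.float32, one_hot=True)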

For MNIST, the training set is mnist.train, and its data is split into images and labels, which can be obtained as follows.

# Get the images, shape (55000, 784)
images = mnist.train.images
# Get the labels, shape (55000, 10)
labels = mnist.train.labels
# Get the total number of images
num_examples = mnist.train.num_examples
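
Because the dataset is loaded with one_hot=True, each label is a vector of 10 values rather than a single digit, which is why the writing code later calls np.argmax on it. The sketch below shows the same conversion in plain Python; one_hot and argmax here are illustrative helpers, not the NumPy functions.

```python
def one_hot(index, num_classes=10):
    """Build a one-hot vector like one row of mnist.train.labels."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def argmax(values):
    """Return the position of the first largest value."""
    return max(range(len(values)), key=values.__getitem__)


label_vector = one_hot(7)
class_index = argmax(label_vector)
print(label_vector)
print(class_index)
```

Storing the class index as a single int64 is more compact than storing the full one-hot vector, and the vector can always be rebuilt from the index at training time.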

Next, define the path where the TFRecord file will be stored and create a writer for it.

# Path where the TFRecord file is stored
filename = 'record/output.tfrecords'
# Create a writer for the TFRecord file
writer = tf.python_io.TFRecordWriter(filename)
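
The path above points into a record/ subdirectory. As far as I know, the writer does not create missing parent directories, so it is safer to create them first. A minimal sketch, assuming a hypothetical helper named ensure_parent_dir and a temporary directory in place of the project folder:

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it does not exist yet."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


# A temporary directory stands in for the project folder
base_dir = tempfile.mkdtemp()
filename = ensure_parent_dir(os.path.join(base_dir, 'record', 'output.tfrecords'))
print(os.path.isdir(os.path.dirname(filename)))
```

The returned path can then be handed to tf.python_io.TFRecordWriter exactly as in the snippet above.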

Now the data can be written in a fixed layout. Loop over every image and write it; inside tf.train.Features, the feature dictionary defines how the data is stored. Take image_raw as an example: after being processed by the function _float_feature, it is stored under the key 'image/encoded' in the tfrecords file.

# Convert every image into an Example and write it
for i in range(num_examples):
    image_raw = images[i]  # the i-th image as a float32 array of 784 values
    image_string = image_raw.tobytes()  # the same pixels as raw bytes (tostring() is a deprecated alias)
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/class/label': _int64_feature(np.argmax(labels[i])),
                'image/encoded': _float_feature(image_raw),
                'image/encoded_tostring': _bytes_feature(image_string)
            }
        )
    )
    print(i,"/",num_examples)
    writer.write(example.SerializeToString())  # write the Example to the TFRecord file

Before it is stored, each value has to be wrapped in a tf.train.Feature. The helper functions below do this, and they must be defined before the loop above runs:

# Build an integer feature
def _int64_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
# Build a float feature
def _float_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))
# Build a bytes (string) feature
def _bytes_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))
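
All three helpers share the same first step: a single value is wrapped in a list, because Int64List, FloatList and BytesList each hold a sequence of values. The sketch below isolates that step in plain Python. The name as_value_list is hypothetical, and the np.ndarray branch of the original check is left out so the example runs without NumPy.

```python
def as_value_list(value):
    """Pass a list through unchanged; wrap anything else in a one-element list."""
    if not isinstance(value, list):
        value = [value]
    return value


print(as_value_list(7))
print(as_value_list([0.0, 0.5]))
print(as_value_list(b'raw bytes'))
```

The bytes case is the one that matters most: a bytes object is itself iterable, so without the wrapping it would be treated as a sequence of individual items instead of one binary string.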

Reading tfrecords

Reading tfrecords starts with creating a reader that reads the Examples in the TFRecord file.

# Create a reader to read the Examples in the TFRecord file
reader = tf.TFRecordReader()

Then create a queue to maintain the list of input files.

# Create a queue that maintains the list of input files
filename_queue = tf.train.string_input_producer(['record/output.tfrecords'])

Use the reader to read from the input file queue, and use parse_single_example to parse each Example into tensors.

# Read one Example from the file
_, serialized_example = reader.read(filename_queue)
# Use parse_single_example to parse the Example into tensors
features = tf.parse_single_example(
    serialized_example,
    features={
        'image/class/label': tf.FixedLenFeature([], tf.int64),
        'image/encoded': tf.FixedLenFeature([784], tf.float32, default_value=tf.zeros([784], dtype=tf.float32)),
        'image/encoded_tostring': tf.FixedLenFeature([], tf.string)
    }
)
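
The feature specification above declares, for each key, how many values to expect and, optionally, what to use when the key is absent. The following is only a conceptual model of that contract in plain Python, not how tensorflow implements it; parse_fixed_len is a hypothetical function and a scalar feature is modelled as a list of length 1.

```python
def parse_fixed_len(record, key, length, default=None):
    """Return record[key] as a list of exactly `length` values, or the default if the key is missing."""
    if key not in record:
        if default is None:
            raise KeyError('required feature missing: ' + key)
        return list(default)
    values = list(record[key])
    if len(values) != length:
        raise ValueError('%s: expected %d values, got %d' % (key, length, len(values)))
    return values


# A record that carries a label but no float image
record = {'image/class/label': [7]}
label = parse_fixed_len(record, 'image/class/label', 1)
image = parse_fixed_len(record, 'image/encoded', 784, default=[0.0] * 784)
print(label, len(image))
```

This mirrors why 'image/encoded' is given a default of 784 zeros in the specification above: a record without that key still parses to a tensor of the declared shape, while a key without a default is required.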

This gives us features, which behaves like a dictionary. Its contents can be read by key, and the keys are the feature names used when the tfrecord file was written.

# Cast the label and float image, and decode the byte string back into the image's pixel array
labels = tf.cast(features['image/class/label'], tf.int32)
images = tf.cast(features['image/encoded'], tf.float32)
images_tostrings = tf.decode_raw(features['image/encoded_tostring'], tf.float32)
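
The last line works because the bytes feature holds nothing but the raw float32 values of the image, 4 bytes each, so decoding with the same type recovers the pixels. The round trip can be sketched with the standard array module, assuming the platform's 'f' type is a 4-byte float (the usual case); the four sample values are made up.

```python
from array import array

# Four float32 "pixels" stand in for the 784 values of one image
pixels = array('f', [0.0, 0.5, 1.0, 0.25])
raw = pixels.tobytes()

# Decoding with the same element type restores the original values
decoded = array('f')
decoded.frombytes(raw)

# Decoding with a different element type silently gives a different length
wrong = array('d')
wrong.frombytes(raw)

print(len(raw))
print(list(decoded))
print(len(wrong))
```

The byte string carries no type or shape information of its own, so the reader has to know that the writer used float32. With 784 pixels the string is 3136 bytes long, and decoding it as float32 yields the (784,) shape seen in the output later in this article.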

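Finally, print the results in a loop (sess is a tf.Session with the queue runners already started, as in the full test code):

# Each run reads one Example. Once every example has been read, this program starts again from the beginning
# Note: every sess.run call dequeues a new Example, so images_tostring comes from the record after the one that produced label and image
for i in range(5):
    label, image = sess.run([labels, images])
    images_tostring = sess.run(images_tostrings)
    print(np.shape(image))
    print(np.shape(images_tostring))
    print(label)
    print("#########################")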
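
The "starts again from the beginning" behaviour comes from the filename queue, which by default keeps re-enqueuing the file list without an epoch limit. A rough model in plain Python, with itertools.cycle standing in for the queue plus reader and made-up record names:

```python
import itertools

# Three records stand in for the contents of one tfrecords file
records = ['example-0', 'example-1', 'example-2']

# Each next() plays the role of one sess.run on the reader's output
stream = itertools.cycle(records)
reads = [next(stream) for _ in range(5)]
print(reads)
```

The fourth read wraps around to the first record. In the same way, any loop that calls sess.run twice per iteration advances through the file two records at a time.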

Test code

1. Writing the tfrecords file

import os
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Build an integer feature
def _int64_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
# Build a float feature
def _float_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))
# Build a bytes (string) feature
def _bytes_feature(value):
    if not isinstance(value,list) and not isinstance(value,np.ndarray):
        value = [value]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))
# Load the MNIST dataset
mnist = input_data.read_data_sets('./MNIST_data', dtype=tf.float32, one_hot=True)
# Get the images, shape (55000, 784)
images = mnist.train.images
# Get the labels, shape (55000, 10)
labels = mnist.train.labels
# Get the total number of images
num_examples = mnist.train.num_examples
# Path where the TFRecord file is stored
filename = 'record/Mnist_Out.tfrecords'
# Make sure the output directory exists before opening the writer
os.makedirs(os.path.dirname(filename), exist_ok=True)
# Create a writer for the TFRecord file
writer = tf.python_io.TFRecordWriter(filename)
# Convert every image into an Example and write it
for i in range(num_examples):
    image_raw = images[i]  # the i-th image as a float32 array of 784 values
    image_string = image_raw.tobytes()  # the same pixels as raw bytes (tostring() is a deprecated alias)
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/class/label': _int64_feature(np.argmax(labels[i])),
                'image/encoded': _float_feature(image_raw),
                'image/encoded_tostring': _bytes_feature(image_string)
            }
        )
    )
    print(i,"/",num_examples)
    writer.write(example.SerializeToString())  # write the Example to the TFRecord file
print('data processing success')
writer.close()

The output is:

……
54993 / 55000
54994 / 55000
54995 / 55000
54996 / 55000
54997 / 55000
54998 / 55000
54999 / 55000
data processing success

2. Reading the tfrecords file

import tensorflow as tf
import numpy as np
# Create a reader to read the Examples in the TFRecord file
reader = tf.TFRecordReader()
# Create a queue that maintains the list of input files
filename_queue = tf.train.string_input_producer(['record/Mnist_Out.tfrecords'])
# Read one Example from the file
_, serialized_example = reader.read(filename_queue)
# Use parse_single_example to parse the Example into tensors
features = tf.parse_single_example(
    serialized_example,
    features={
        'image/class/label': tf.FixedLenFeature([], tf.int64),
        'image/encoded': tf.FixedLenFeature([784], tf.float32, default_value=tf.zeros([784], dtype=tf.float32)),
        'image/encoded_tostring': tf.FixedLenFeature([], tf.string)
    }
)
# Cast the label and float image, and decode the byte string back into the image's pixel array
labels = tf.cast(features['image/class/label'], tf.int32)
images = tf.cast(features['image/encoded'], tf.float32)
images_tostrings = tf.decode_raw(features['image/encoded_tostring'], tf.float32)
sess = tf.Session()
# Start the queue-runner threads that feed the input pipeline
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# Each run reads one Example. Once every example has been read, this program starts again from the beginning
# Note: every sess.run call dequeues a new Example, so images_tostring comes from the record after the one that produced label and image
for i in range(5):
    label, image = sess.run([labels, images])
    images_tostring = sess.run(images_tostrings)
    print(np.shape(image))
    print(np.shape(images_tostring))
    print(label)
    print("#########################")
# Stop the queue-runner threads and release the session
coord.request_stop()
coord.join(threads)
sess.close()

The output is:

#########################
(784,)
(784,)
7
#########################
#########################
(784,)
(784,)
4
#########################
#########################
(784,)
(784,)
1
#########################
#########################
(784,)
(784,)
1
#########################
#########################
(784,)
(784,)
9
#########################

Original link: https://blog.csdn.net/weixin_44791964/article/details/102566358
