Each line in a CSV file represents one row of a spreadsheet, and commas separate the cells in that row.

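This chapter uses the file below in its interactive examples. You can also type the text into a text editor and save it as example.csv. The copy used here is stored at /data/demo/demo.csv, and its contents are as follows: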

In [11]:

!cat /data/demo/demo.csv

序号,URL,TITLE,Abstract,标题,摘要
1,http://drr.ikcest.org/info/9ad53,Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015,"This dataset described the Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015, which mainly record the degree of desertification, and spatiotemporal distribution information. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 6 vector files and 4 grid files. It can be used in the study of desertification. And it can provide important basis for monitoring and prevention of desertification disaster.",中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据集（2000、2015）,本数据集为2000、2015年中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据，其主要记录荒漠化程度及荒漠化时空分布特点，共6个矢量文件和4个栅格文件。它们由中国科学院地理科学与资源研究所收集和组织，其可用于荒漠化研究，为荒漠化灾害监测与防控提供重要依据。
2,http://drr.ikcest.org/info/98d74,"Meteorological resource database of "" Belt and Road"" China-Mongolia-Russia economic corridor","This dataset described the distribution of meteorological resource in China-Mongolia-Russia economic corridor, which mainly record the travel climate comfortable degree in the cross-border region between China and Russia. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 56  raster files. It can be used to study meteorological disasters and provide important basis for disaster prevention and reduction and reducing the negative effects of meteorological disasters.",“一带一路”中蒙俄经济走廊气象资源数据,"该数据集描述了中蒙经济走廊的气象资源分布，主要记录了中俄跨境区域的旅游气候舒适度。它们由中国科学院地理科学与自然资源研究所收集和组织。该
数据集由56个光栅文件组成。它可用于研究气象灾害，为防灾减灾和减少气象灾害的负面影响提供重要依据。"
3,http://drr.ikcest.org/info/91462,Dataset of desertification related land cover distribution along China-Mongolia railway (Mongolia section) in 2015,"This dataset was the land cover distribution data related to desertification along the China-Mongolia railway (Mongolia section) in 2015. This dataset used the object-oriented remote sensing image interpretation method to obtain the desertification data with a resolution of 30 meters along the China-Mongolia railway (Mongolia section) in 2015. It was collected and organized by the Institute of Geographic Sciences and Natural Resources Research, CAS. It can be used to study the risk assessment of desertification in China-Mongolia railway, providing an important basis for preventing sandstorms, floods and other disasters caused by desertification and alleviating the negative impact of desertification.",中蒙铁路沿线（蒙古段）荒漠化土地覆被分布数据集（2015）,"该数据集是2015年中蒙铁路（蒙古段）荒漠化相关的土地覆盖分布数据。该数据集采用面向对象的遥感影像解译方法获取2015年中蒙铁路（蒙古段）沿海30米的
荒漠化数据。由中国科学院地理科学与资源研究所收集整理。可用于研究中蒙铁路荒漠化风险评估，为防治沙漠化造成的沙尘暴，洪涝等灾害，减轻荒漠化的负面影响提供重要依据。"

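CSV files are simple and lack many of the features of an Excel spreadsheet. For example, in a CSV file:

values have no types; everything is a string;
there are no settings for font size or color;
there are no multiple worksheets;
cell widths and heights cannot be specified;
cells cannot be merged;
images and charts cannot be embedded.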
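Because every value comes back as a string, numeric columns have to be converted explicitly. A minimal sketch, assuming a small made-up table held in memory (the `text` sample and its column names are illustrative and are not part of demo.csv):

```python
import csv
import io

# Hypothetical sample data, used only for illustration.
text = "name,count\napples,3\npears,5\n"

rows = list(csv.reader(io.StringIO(text, newline='')))
header, records = rows[0], rows[1:]

# Every cell is a str, even the ones that look like numbers.
print(type(records[0][1]))

# Convert explicitly before doing arithmetic.
total = sum(int(count) for _name, count in records)
print(total)
```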

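The advantage of CSV files is simplicity. CSV files are widely supported by many kinds of programs, can be viewed in a text editor (including IDLE's file editor), and are a straightforward way to represent spreadsheet data. The CSV format is exactly what its name says: a text file of comma-separated values.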

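Because CSV files are just text files, you might be tempted to read them in as a string and then process that string. For example, since the cells in a CSV file are separated by commas, you could try calling split(',') on each line of text to get the values. But not every comma in a CSV file marks the boundary between two cells. CSV files have their own quoting and escaping rules that allow commas and other characters to be part of a value, and split() does not understand those rules. Because of these potential pitfalls, you should use the csv module to read and write CSV files in Python.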
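The difference is easy to see on a single line. The sketch below is only an illustration: the `line` value is a made-up sample, and csv.reader() is given a one-element list because it accepts any iterable of strings.

```python
import csv

# One CSV line whose first cell contains a comma, so it is quoted.
line = '"Hello, world!",eggs,bacon,ham'

naive = line.split(',')            # splits on every comma, quoted or not
parsed = next(csv.reader([line]))  # honors the quoting rules

print(naive)   # five pieces, with stray quote characters left in
print(parsed)  # four cells, with the comma kept inside the first one
```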

`Reader` Objects

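To read data from a CSV file with the csv module, you need to create a Reader object. A Reader object lets you iterate over the rows of the CSV file. Enter the following into the interactive shell. The code reads /data/demo/demo.csv; if you saved your own copy as example.csv in the current working directory, use that path instead: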

In [12]:

import csv
exampleFile = open('/data/demo/demo.csv')
exampleReader = csv.reader(exampleFile)
exampleData = list(exampleReader)
exampleData

Out[12]:

[['序号', 'URL', 'TITLE', 'Abstract', '标题', '摘要'],
 ['1',
  'http://drr.ikcest.org/info/9ad53',
  'Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015',
  'This dataset described the Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015, which mainly record the degree of desertification, and spatiotemporal distribution information. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 6 vector files and 4 grid files. It can be used in the study of desertification. And it can provide important basis for monitoring and prevention of desertification disaster.',
  '中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据集（2000、2015）',
  '本数据集为2000、2015年中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据，其主要记录荒漠化程度及荒漠化时空分布特点，共6个矢量文件和4个栅格文件。它们由中国科学院地理科学与资源研究所收集和组织，其可用于荒漠化研究，为荒漠化灾害监测与防控提供重要依据。'],
 ['2',
  'http://drr.ikcest.org/info/98d74',
  'Meteorological resource database of " Belt and Road" China-Mongolia-Russia economic corridor',
  'This dataset described the distribution of meteorological resource in China-Mongolia-Russia economic corridor, which mainly record the travel climate comfortable degree in the cross-border region between China and Russia. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 56  raster files. It can be used to study meteorological disasters and provide important basis for disaster prevention and reduction and reducing the negative effects of meteorological disasters.',
  '“一带一路”中蒙俄经济走廊气象资源数据',
  '该数据集描述了中蒙经济走廊的气象资源分布，主要记录了中俄跨境区域的旅游气候舒适度。它们由中国科学院地理科学与自然资源研究所收集和组织。该\n数据集由56个光栅文件组成。它可用于研究气象灾害，为防灾减灾和减少气象灾害的负面影响提供重要依据。'],
 ['3',
  'http://drr.ikcest.org/info/91462',
  'Dataset of desertification related land cover distribution along China-Mongolia railway (Mongolia section) in 2015',
  'This dataset was the land cover distribution data related to desertification along the China-Mongolia railway (Mongolia section) in 2015. This dataset used the object-oriented remote sensing image interpretation method to obtain the desertification data with a resolution of 30 meters along the China-Mongolia railway (Mongolia section) in 2015. It was collected and organized by the Institute of Geographic Sciences and Natural Resources Research, CAS. It can be used to study the risk assessment of desertification in China-Mongolia railway, providing an important basis for preventing sandstorms, floods and other disasters caused by desertification and alleviating the negative impact of desertification.',
  '中蒙铁路沿线（蒙古段）荒漠化土地覆被分布数据集（2015）',
  '该数据集是2015年中蒙铁路（蒙古段）荒漠化相关的土地覆盖分布数据。该数据集采用面向对象的遥感影像解译方法获取2015年中蒙铁路（蒙古段）沿海30米的\n荒漠化数据。由中国科学院地理科学与资源研究所收集整理。可用于研究中蒙铁路荒漠化风险评估，为防治沙漠化造成的沙尘暴，洪涝等灾害，减轻荒漠化的负面影响提供重要依据。']]

The csv module comes with Python, so you can import it without installing anything.

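To read a CSV file with the csv module, first open it with the open() function, just as you would any other text file. But instead of calling read() or readlines() on the file object that open() returns, pass it to the csv.reader() function. This returns a Reader object for you to use. Note that you cannot pass a filename string directly to csv.reader().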

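The most direct way to access the values in a Reader object is to convert it to a plain Python list by passing it to list(). Calling list() on the Reader object returns a list of lists, which you can store in a variable such as exampleData. Entering exampleData in the interactive shell displays the list of lists.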
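The session above leaves the file open. A hedged variant closes it automatically with a `with` block and passes `newline=''`, which the csv documentation recommends for files handed to csv.reader(). The temporary file and its two rows are stand-ins for example.csv, and the UTF-8 encoding is an assumption about how the file was saved:

```python
import csv
import os
import tempfile

# Stand-in for example.csv: a small file written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'example.csv')
with open(path, 'w', newline='', encoding='utf-8') as f:
    f.write('序号,URL\r\n1,http://drr.ikcest.org/info/9ad53\r\n')

# newline='' lets the csv module do its own newline handling;
# the with block closes the file once the rows have been read.
with open(path, newline='', encoding='utf-8') as exampleFile:
    exampleData = list(csv.reader(exampleFile))

print(exampleData[1][1])
```

Reading everything inside the `with` block matters: once the block ends the file is closed, and a Reader that has not been consumed yet can no longer produce rows.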

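Now that the CSV file is represented as a list of lists, you can access the value at a particular row and column with the expression exampleData[row][col], where row is the index of one of the lists in exampleData and col is the index of the item you want from that list. Enter the following into the interactive shell: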

In [13]:

exampleData[0][0]

Out[13]:

'序号'

In [14]:

exampleData[0][1]

Out[14]:

'URL'

In [15]:

exampleData[0][2]

Out[15]:

'TITLE'

In [16]:

exampleData[1][1]

Out[16]:

'http://drr.ikcest.org/info/9ad53'

exampleData[0][0] goes into the first list and gives the first string, exampleData[0][2] goes into the first list and gives the third string, and so on.

`Reader` Objects in a `for` Loop

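You can also read data from a Reader object in a for loop. For large CSV files, you will want to use the Reader object this way, because it avoids loading the entire file into memory at once. For example, enter the following into the interactive shell: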

In [17]:

import csv
exampleFile = open('/data/demo/demo.csv')
exampleReader = csv.reader(exampleFile)
for row in exampleReader:
    print('Row #' + str(exampleReader.line_num) + ' ' + str(row))

Row #1 ['序号', 'URL', 'TITLE', 'Abstract', '标题', '摘要']
Row #2 ['1', 'http://drr.ikcest.org/info/9ad53', 'Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015', 'This dataset described the Spatio-temporal Distribution of Desertification Disaster along the China-Mongolia railway (Mongolia section) in 2000 and 2015, which mainly record the degree of desertification, and spatiotemporal distribution information. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 6 vector files and 4 grid files. It can be used in the study of desertification. And it can provide important basis for monitoring and prevention of desertification disaster.', '中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据集（2000、2015）', '本数据集为2000、2015年中蒙铁路沿线（蒙古段）荒漠化灾害时空分布数据，其主要记录荒漠化程度及荒漠化时空分布特点，共6个矢量文件和4个栅格文件。它们由中国科学院地理科学与资源研究所收集和组织，其可用于荒漠化研究，为荒漠化灾害监测与防控提供重要依据。']
Row #4 ['2', 'http://drr.ikcest.org/info/98d74', 'Meteorological resource database of " Belt and Road" China-Mongolia-Russia economic corridor', 'This dataset described the distribution of meteorological resource in China-Mongolia-Russia economic corridor, which mainly record the travel climate comfortable degree in the cross-border region between China and Russia. They were collected and organized by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. This dataset was composed of 56  raster files. It can be used to study meteorological disasters and provide important basis for disaster prevention and reduction and reducing the negative effects of meteorological disasters.', '“一带一路”中蒙俄经济走廊气象资源数据', '该数据集描述了中蒙经济走廊的气象资源分布，主要记录了中俄跨境区域的旅游气候舒适度。它们由中国科学院地理科学与自然资源研究所收集和组织。该\n数据集由56个光栅文件组成。它可用于研究气象灾害，为防灾减灾和减少气象灾害的负面影响提供重要依据。']
Row #6 ['3', 'http://drr.ikcest.org/info/91462', 'Dataset of desertification related land cover distribution along China-Mongolia railway (Mongolia section) in 2015', 'This dataset was the land cover distribution data related to desertification along the China-Mongolia railway (Mongolia section) in 2015. This dataset used the object-oriented remote sensing image interpretation method to obtain the desertification data with a resolution of 30 meters along the China-Mongolia railway (Mongolia section) in 2015. It was collected and organized by the Institute of Geographic Sciences and Natural Resources Research, CAS. It can be used to study the risk assessment of desertification in China-Mongolia railway, providing an important basis for preventing sandstorms, floods and other disasters caused by desertification and alleviating the negative impact of desertification.', '中蒙铁路沿线（蒙古段）荒漠化土地覆被分布数据集（2015）', '该数据集是2015年中蒙铁路（蒙古段）荒漠化相关的土地覆盖分布数据。该数据集采用面向对象的遥感影像解译方法获取2015年中蒙铁路（蒙古段）沿海30米的\n荒漠化数据。由中国科学院地理科学与资源研究所收集整理。可用于研究中蒙铁路荒漠化风险评估，为防治沙漠化造成的沙尘暴，洪涝等灾害，减轻荒漠化的负面影响提供重要依据。']

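After you import the csv module and make a Reader object from the CSV file, you can loop through the rows in the Reader object. Each row is a list of values, with each value representing a cell. The print() call prints the number of the current line and the contents of the row. To get the line number, use the Reader object's line_num attribute, which holds the number of lines read from the file so far. Because a quoted cell may itself contain a line break, line_num can run ahead of the number of rows, which is why the output above jumps from Row #2 to Row #4 and Row #6.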
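The gap between line_num and the number of rows can be reproduced on a tiny input. This is a sketch under stated assumptions: the `text` sample is made up, and enumerate() is used here as one way to get a true row counter alongside line_num.

```python
import csv
import io

# Header plus two records; the second cell of record 1 spans two lines.
text = 'id,note\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

reader = csv.reader(io.StringIO(text, newline=''))
seen = []
for record_number, row in enumerate(reader, start=1):
    # record_number counts rows; line_num counts physical lines read.
    seen.append((record_number, reader.line_num))
    print(record_number, reader.line_num, row)
```

If you need to report positions to a user, line_num points at the place in the file, while the enumerate() counter tells you which row you are on.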

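A Reader object can be looped over only once. To read the CSV file again, you must reopen the file (or rewind it to the beginning) and call csv.reader() to create a new Reader object.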
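A short sketch of this behavior, using an in-memory io.StringIO buffer as a stand-in for an open file (the sample rows are made up):

```python
import csv
import io

source = io.StringIO('a,b\r\n1,2\r\n', newline='')  # stands in for an open file
reader = csv.reader(source)

first_pass = list(reader)
second_pass = list(reader)   # the Reader is already exhausted

source.seek(0)               # rewind the underlying file object
third_pass = list(csv.reader(source))

print(first_pass, second_pass, third_pass)
```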

`Writer` Objects

A Writer object lets you write data to a CSV file. To create a Writer object, use the csv.writer() function. Enter the following into the interactive shell:

In [18]:

import csv
outputFile = open('xx_output.csv', 'w', newline='')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['spam', 'eggs', 'bacon', 'ham'])

Out[18]:

21

In [19]:

outputWriter.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])

Out[19]:

32

In [20]:

outputWriter.writerow([1, 2, 3.141592, 4])

Out[20]:

16

In [21]:

outputFile.close()

First, call open() and pass it 'w' to open a file in write mode. This creates the file object, which you then pass to csv.writer() to create a Writer object.

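On Windows, you also need to pass an empty string for the open() function's newline keyword argument. The technical reasons are beyond the scope of this book, but if you forget to set newline, the rows in xx_output.csv will be double-spaced. Passing newline='' is harmless on other platforms, and the csv documentation recommends it for every file handed to the module.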

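The Writer object's writerow() method takes a list argument. Each value in the list is placed in its own cell in the output CSV file. The return value of writerow() is the number of characters written to the file for that row, including the line terminator.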
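The return value can be checked without touching the disk. A minimal sketch, assuming an io.StringIO buffer as a stand-in for the output file:

```python
import csv
import io

buffer = io.StringIO(newline='')   # in-memory stand-in for the output file
writer = csv.writer(buffer)

written = writer.writerow(['spam', 'eggs', 'bacon', 'ham'])

print(written)                  # characters written, line terminator included
print(repr(buffer.getvalue()))  # shows the '\r\n' the writer appends by default
```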

The file this code produces looks like this:

spam,eggs,bacon,ham
"Hello, world!",eggs,bacon,ham
1,2,3.141592,4

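Notice how the Writer object automatically escapes the comma in 'Hello, world!' by wrapping the value in double quotes in the CSV file. The csv module saves you from having to handle these special cases yourself.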
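The same quoting covers embedded double quotes and line breaks, and a Reader undoes it on the way back in. The round trip below is a sketch with made-up values, again using an in-memory buffer instead of a file:

```python
import csv
import io

# Values containing a comma, double quotes, and a line break.
row = ['Hello, world!', 'say "hi"', 'two\nlines']

buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(row)
print(buffer.getvalue())   # quoted cells; inner quotes are doubled

buffer.seek(0)
round_trip = next(csv.reader(buffer))
print(round_trip == row)
```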

The `delimiter` and `lineterminator` Keyword Arguments

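Say you want to separate cells with a tab character instead of a comma and you want the rows to be double-spaced. You could enter something like the following into the interactive shell: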

In [22]:

import csv
csvFile = open('xx_example.tsv', 'w', newline='')
csvWriter = csv.writer(csvFile, delimiter='\t', lineterminator='\n\n')
csvWriter.writerow(['apples', 'oranges', 'grapes'])

Out[22]:

In [23]:

csvWriter.writerow(['eggs', 'bacon', 'ham'])

Out[23]:

In [24]:

csvWriter.writerow(['spam', 'spam', 'spam', 'spam', 'spam', 'spam'])

Out[24]:

In [25]:

csvFile.close()

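This changes the delimiter and line terminator characters in the file. The delimiter is the character that appears between cells on a row. By default, the delimiter for a CSV file is a comma. The line terminator is the character sequence that comes at the end of a row. By default, csv.writer() ends each row with '\r\n'. You can change both to different values with the delimiter and lineterminator keyword arguments of csv.writer().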

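Passing delimiter='\t' and lineterminator='\n\n' changes the character between cells to a tab and the character sequence between rows to two newlines. Calling writerow() three times gives three rows. This produces the file xx_example.tsv with the following contents: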

In [26]:

!more xx_example.tsv

apples oranges grapes

eggs    bacon   ham

spam    spam    spam    spam    spam    spam

Because the cells are now separated by tabs, the file uses the .tsv extension, for tab-separated values.
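Reading such a file back requires the same delimiter. In the sketch below, which uses an in-memory buffer with two made-up rows, the blank lines produced by the doubled line terminator come back as empty lists, so they are filtered out; that filtering step is one possible choice rather than anything the csv module requires.

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n\n')
writer.writerow(['apples', 'oranges', 'grapes'])
writer.writerow(['eggs', 'bacon', 'ham'])

buffer.seek(0)
# Pass the same delimiter when reading; skip the empty lists
# that the blank lines between rows turn into.
rows = [row for row in csv.reader(buffer, delimiter='\t') if row]
print(rows)
```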
