Practical Python Programming MOOC - Chapter 8: File I/O, Folder Operations, and Databases


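Reading and Writing Text Files

Overview of Text File I/O

The open function opens a file; store its return value in a variable, for example f
Use f.write to write to the file
Use f.readlines to read the whole file as a list of lines
Use f.readline to read one line of the file
Use f.close() to close the file
Use f.read() to read the whole file. It returns a single string containing the entire contents of the file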
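
The calls listed above can be tried together in one short sketch. It is not part of the course material: the temporary folder and the use of the `with` statement (which closes the file automatically) are assumptions made so that the example runs on any machine.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "t.txt")

# "with" closes the file when the block ends, so no explicit f.close() is needed.
with open(path, "w", encoding="utf-8") as f:
    f.write("good\n")
    f.write("ok\n")

with open(path, "r", encoding="utf-8") as f:
    first_line = f.readline()   # one line, including its trailing "\n"
    rest = f.read()             # everything that is left, as a single string

with open(path, "r", encoding="utf-8") as f:
    lines = f.readlines()       # list of lines, each ending in "\n"

print(repr(first_line), repr(rest), lines)
```

Note that f.read() only returns what has not been read yet, which is why `rest` here holds the second line only.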

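Creating a File and Writing to It

a = open("c:\\tmp\\t.txt", "w")  # the folder c:\tmp must already exist
# "w" means write; if the file already exists, opening it this way overwrites it
a.write("good\n")
a.write("好啊\n")
a.close()

Contents of c:\tmp\t.txt after running:
good
好啊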

Reading an Existing File

f = open("c:\\tmp\\t.txt", "r")  # "r" means read
lines = f.readlines()  # lines is a list of strings, one per line, each with its trailing "\n"
f.close()
for x in lines:
    print(x, end="")

Output:
good
好啊

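Reading an Existing File by Iterating over It

# readlines is not required: a file object can be iterated line by line
f = open("c:\\tmp\\t.txt", "r", encoding="utf-8")  # the encoding must match the one the file was written with
for x in f:
    print(x, end="")
f.close()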

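Reading One Line at a Time with readline

infile = open("c:\\tmp\\t.txt", "r")

while True:
    data1 = infile.readline()  # data1 keeps its trailing "\n"; an empty line is the single character "\n"
    if data1 == "":  # an empty string means the end of the file
        break
    data1 = data1.strip()  # remove whitespace at both ends, including the trailing "\n"
    print(data1)

infile.close()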

Opening a File That Does Not Exist Raises an Exception

try:
    f = open("c:\\tmp\\ts.txt", "r")  # if the file does not exist, an exception is raised and execution jumps to the except clause
    lines = f.readlines()
    f.close()
    for x in lines:
        print(x, end="")
except Exception as e:
    print(e)  #>> [Errno 2] No such file or directory: 'c:\\tmp\\ts.txt'
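
As a variation on the pattern above, the sketch below catches only FileNotFoundError, so that other errors are not silently swallowed. The helper name `read_lines_or_empty` and the temporary path are made up for illustration.

```python
import os
import tempfile

def read_lines_or_empty(path):
    """Return the lines of a text file, or [] if the file does not exist."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.readlines()
    except FileNotFoundError as e:
        # e.filename is the path that could not be opened
        print("cannot open:", e.filename)
        return []

# Hypothetical path inside a fresh temporary folder; the file is never created.
missing = os.path.join(tempfile.mkdtemp(), "ts.txt")
print(read_lines_or_empty(missing))
```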

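Appending to a File

f = open("c:\\tmp\\t.txt", "a")  # "a" opens the file for appending; if the file does not exist, it is created
f.write("新增行\n")
f.write("ok\n")
f.close()

Contents of c:\tmp\t.txt after running:
good
好啊
新增行
ok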

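Text File Encodings

The two encodings met most often here are gbk and utf-8. If a file is opened with the wrong encoding, it cannot be read correctly
What Windows tools call ANSI corresponds to gbk on a Chinese-language system
When a file is written without specifying an encoding, the operating system's default encoding is used

Windows: gbk, possibly utf-8 starting with Windows 10.
Linux, macOS: utf-8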

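Encoding of Python Source Files

A .py file should be saved as utf-8 in order to run. If it is saved in ANSI (gbk) format instead, put an encoding declaration on the first line of the file:

#coding=gbk
print("你好")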

Specifying the Encoding When Creating and Reading Files

outfile = open("c:\\tmp\\t.txt", "w", encoding="utf-8")
# if no encoding is given when opening a file for writing, the system default is used, which on Windows 10 may still be ANSI (gbk)
outfile.write("这很好ok\n")
outfile.write("这ok")
outfile.close()

infile = open("c:\\tmp\\t.txt", "r", encoding="utf-8")
lines = infile.readlines()
infile.close()
for x in lines:
    print(x.strip())
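
To see what "cannot be read correctly" means in practice, the following sketch writes a file as gbk and then tries to read it back as utf-8. The temporary path and variable names are assumptions for the example only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "t.txt")  # hypothetical scratch file
text = "这很好ok\n"

with open(path, "w", encoding="gbk") as f:
    f.write(text)

# Reading with the same encoding gives the original text back.
with open(path, "r", encoding="gbk") as f:
    same = f.read()

# Reading with a different encoding fails: these gbk bytes are not valid utf-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

print(same == text, mismatch_detected)
```

A mismatch does not always raise an exception; with other byte sequences it can also produce silently garbled text, so stating the encoding explicitly on both sides is the safer habit.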

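File Paths

Relative and Absolute Forms of the File Name Passed to open

Relative path: the file name does not include a drive letter

open("readme.txt", "r")
    # the file is in the current folder (current path)
open("tmp/readme.txt", "r")
    # writing "/" as "\\" has the same effect
    # the file is in the tmp folder inside the current folder
open("tmp/test/readme.txt", "r")
    # the file is in the test folder inside the tmp folder of the current folder
open("../readme.txt", "r")
    # the file is in the folder one level above the current folder
open("../../readme.txt", "r")
    # the file is in the folder two levels above the current folder
open("../tmp2/test/readme.txt", "r")
    # the file is in the test folder of the tmp2 folder one level above the current folder
    # tmp2 and the current folder are siblings
open("/tmp3/test/readme.txt", "r")
    # the file is in tmp3/test/ under the root folder of the current drive

Absolute path: the file name includes a drive letter

open("d:/tmp/test/readme.txt", "r")

A path is also called a folder or a directory (path, folder, directory)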
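
The rules above can be checked without touching the disk. The sketch below uses `ntpath`, the Windows flavour of `os.path` that can be imported on any platform, to show where each name would point. The current folder `c:/tmp5/test` and the helper `resolve` are assumptions for illustration.

```python
import ntpath  # Windows version of os.path; importable on every platform

current = "c:/tmp5/test"  # hypothetical current folder

def resolve(name):
    """Show where a file name passed to open() would point, given `current`."""
    return ntpath.normpath(ntpath.join(current, name))

print(resolve("readme.txt"))               # c:\tmp5\test\readme.txt
print(resolve("tmp/readme.txt"))           # c:\tmp5\test\tmp\readme.txt
print(resolve("../readme.txt"))            # c:\tmp5\readme.txt
print(resolve("../tmp2/test/readme.txt"))  # c:\tmp5\tmp2\test\readme.txt
print(resolve("d:/tmp/test/readme.txt"))   # d:\tmp\test\readme.txt

# A name with a drive letter and a root is absolute; the others are relative.
print(ntpath.isabs("d:/tmp/test/readme.txt"), ntpath.isabs("tmp/readme.txt"))
```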

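The Current Folder (Current Path, Current Directory) of a Python Program

A running program always has a "current folder". When open is given a file name that is not an absolute path, the name is resolved relative to the current folder.
In most cases the folder that contains the .py file is the current folder when the program runs. This is what happens when a program is run from PyCharm.
A program can query its current folder:

import os
print(os.getcwd())  # os.getcwd() returns the current folder
#>>c:\tmp5\test

When a program is run from the command line, the current folder of the cmd window becomes the current folder of the program, no matter where the program file is stored.

Contents of c:\tmp5\test\t1.py:

import os
print(os.getcwd())

Window title: C:\WINDOWS\system32\cmd.exe

C:\music\violin>python c:\tmp5\test\t1.py
C:\music\violin

C:\music\violin>

A program can change its current folder while it is running

Contents of c:\tmp5\test\t1.py:

import os
print(os.getcwd())
os.chdir("c:/tmp")
print(os.getcwd())

Result:

Window title: C:\WINDOWS\system32\cmd.exe
C:\music\violin>python c:\tmp5\test\t1.py
C:\music\violin
C:\tmp

C:\music\violin>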
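
A small sketch of the same idea that runs anywhere: after os.chdir, a relative file name is created inside the new current folder. The temporary folder stands in for c:/tmp and is an assumption of this example.

```python
import os
import tempfile

start = os.getcwd()          # remember where the program started
folder = tempfile.mkdtemp()  # hypothetical folder to switch into

try:
    os.chdir(folder)
    # "readme.txt" is a relative name, so it is created in the new current folder.
    with open("readme.txt", "w", encoding="utf-8") as f:
        f.write("hello\n")
    created_here = os.path.isfile(os.path.join(folder, "readme.txt"))
finally:
    os.chdir(start)          # restore the original current folder

print(created_here)
```

Restoring the original folder in `finally` keeps the rest of a larger program from being surprised by the change.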

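Folder Operations

Python Functions for Working with Folders

The os and shutil Libraries

The os and shutil libraries provide functions for working with files and folders (a folder is also called a "directory")

Function	Purpose
`os.chdir(x)`	Set the program's current folder to x
`os.getcwd()`	Return the program's current folder
`os.listdir(x)`	Return a list with the names of all files and subfolders in the folder x
`os.mkdir(x)`	Create the folder x
`os.path.getsize(x)`	Return the size of the file x (in bytes)
`os.path.isfile(x)`	Tell whether x is a file
`os.remove(x)`	Delete the file x
`os.rmdir(x)`	Delete the folder x. x must be empty for the deletion to succeed
`os.rename(x,y)`	Rename the file or folder x to y. Besides renaming, it can also move a file or folder. For example, os.rename("c:/tmp/a", "c:/tmp2/b") moves the folder or file "c:/tmp/a" into the folder "c:/tmp2/" and renames it to b, provided that tmp2 already exists.
`shutil.copyfile(x,y)`	Copy the file x to the file y. If y already exists, it is overwritten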
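
The functions in the table can be exercised together in a scratch folder. This is only a sketch; the folder and file names (`hello`, `a.txt`, `b.txt`, `c.txt`) are invented for the example.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                 # scratch folder for the sketch
sub = os.path.join(base, "hello")
os.mkdir(sub)                             # create the folder "hello"

a = os.path.join(base, "a.txt")
with open(a, "w", encoding="utf-8") as f:
    f.write("12345")

b = os.path.join(base, "b.txt")
shutil.copyfile(a, b)                     # copy a.txt to b.txt
moved = os.path.join(sub, "c.txt")
os.rename(b, moved)                       # move b.txt into hello/ and rename it to c.txt

names = sorted(os.listdir(base))          # ['a.txt', 'hello']
size = os.path.getsize(a)                 # 5 bytes
os.remove(moved)                          # hello/ is empty again
os.rmdir(sub)                             # so rmdir succeeds
after = os.listdir(base)                  # ['a.txt']
print(names, size, after)
```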

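A Recursive Function for Deleting a Folder

(A folder deleted this way cannot be recovered)

import os
def powerRmDir(path):  # delete the folder path together with everything inside it
    lst = os.listdir(path)
    for x in lst:
        actualFileName = path + "/" + x  # x does not include the path, e.g. a.txt
        if os.path.isfile(actualFileName):  # actualFileName is a file
            os.remove(actualFileName)
        else:
            powerRmDir(actualFileName)  # actualFileName is a folder
    os.rmdir(path)

powerRmDir("c:/tmp/ttt")
powerRmDir("tmp/ttt")  # delete the ttt folder inside the tmp folder of the current folder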
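
For comparison, the standard library already offers shutil.rmtree, which removes a folder and everything in it in one call. The sketch below is not the course's method; it builds a small nested folder in a temporary location (an assumption of the example) and deletes it.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
target = os.path.join(root, "ttt")               # hypothetical folder to delete
os.makedirs(os.path.join(target, "inner"))       # nested folders ttt/inner
with open(os.path.join(target, "inner", "a.txt"), "w", encoding="utf-8") as f:
    f.write("x")

shutil.rmtree(target)                            # removes ttt and all its contents
print(os.path.exists(target))
```

As with the recursive function, nothing goes to a recycle bin, so the path should be checked carefully before the call.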

A Recursive Function for the Total Size of a Folder

import os
def getTotalSize(path):
    total = 0
    lst = os.listdir(path)
    for x in lst:
        actualFileName = path + "/" + x  # x does not include the path
        if os.path.isfile(actualFileName):
            total += os.path.getsize(actualFileName)
        else:
            total += getTotalSize(actualFileName)
    return total
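
The same total can be computed without explicit recursion by letting os.walk visit every subfolder. This is an alternative sketch, not the course's version; the function name `total_size` and the test files are made up.

```python
import os
import tempfile

def total_size(path):
    """Sum the sizes of all files below path, using os.walk instead of recursion."""
    total = 0
    for folder, _subfolders, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(folder, name))
    return total

# Build a tiny folder tree: 3 bytes at the top level and 4 bytes in a subfolder.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
with open(os.path.join(root, "a.bin"), "wb") as f:
    f.write(b"123")
with open(os.path.join(root, "sub", "b.bin"), "wb") as f:
    f.write(b"4567")

print(total_size(root))
```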

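Command-Line Arguments

Running a Python Program from the Command Line

Having to start a Python program from PyCharm every time is inconvenient.

So it is sometimes necessary to run a Python program from the command line (as a command script)

How to do it:

In a command-line window (called "Terminal" on a Mac), type:

python xxx.py

to run xxx.py

On Windows, press Win+R to bring up the "Run" dialog, type "cmd" and confirm; a cmd window (command-line window) opens

On a Mac, the equivalent is to start "Terminal" from Launchpad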

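Command-Line Arguments

Suppose you have written a program hello.py that merges two files
and you want typing

python hello.py a1.txt a2.txt

on the command line to append a2.txt to a1.txt.
How does hello.py know at run time that the files to process are a1.txt and a2.txt?
a1.txt and a2.txt are both "command-line arguments", so a program needs a way to get at its command-line arguments

import sys
for x in sys.argv:
    print(x)

Save the program as hello.py and run it in a command-line window like this:

python hello.py this is "hello world"

Output:

hello.py
this
is
hello world

So inside the program

sys.argv[0] is 'hello.py'
sys.argv[1] is 'this'
sys.argv[2] is 'is'
sys.argv[3] is 'hello world'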
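
The merging program itself is not shown in the notes. One possible sketch, under the assumption that "merging" means appending the second file to the end of the first, is given below. The function `merge` takes the argument list as a parameter so that it can be tried without a command line; in a real hello.py it would be called as merge(sys.argv).

```python
import os
import tempfile

def merge(argv):
    """Append the file named by argv[2] to the end of the file named by argv[1]."""
    if len(argv) != 3:
        print("usage: python hello.py target_file source_file")
        return False
    with open(argv[2], "r", encoding="utf-8") as src:
        extra = src.read()
    with open(argv[1], "a", encoding="utf-8") as dst:  # "a" appends to the target
        dst.write(extra)
    return True

# Simulate "python hello.py a1.txt a2.txt" with two scratch files.
folder = tempfile.mkdtemp()
a1 = os.path.join(folder, "a1.txt")
a2 = os.path.join(folder, "a2.txt")
with open(a1, "w", encoding="utf-8") as f:
    f.write("one\n")
with open(a2, "w", encoding="utf-8") as f:
    f.write("two\n")

merge(["hello.py", a1, a2])
with open(a1, "r", encoding="utf-8") as f:
    merged = f.read()
print(repr(merged))
```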

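The Current Folder of a Program Run from the Command Line

When a program is started from the command line, its current folder is the folder shown in the command prompt, not the folder that contains the Python program file.

Contents of C:/tmp5/test/t.py:

import os
print(os.getcwd())

Window title: C:\WINDOWS\system32\cmd.exe

C:\diskd>python c:\tmp5\test\t.py
C:\diskd

C:\diskd>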

File Processing Examples

Program 1: Counting Word Frequencies in an Article

Program name: countfile.py

Start the program from the command line:
python countfile.py source_file result_file

For example:
python countfile.py a1.txt r1.txt
python countfile.py c:\tmp\a4.txt d:\tmp\r4.txt

The program analyses the word frequencies (numbers of occurrences) in the "source file" and writes the result to the "result file", with the words in dictionary order

Format of the article file a1.txt:

When many couples decide to expand their family, they often take into consideration the different genetic traits that they may pass on to their children. For example, if someone has a history of heart problems, they might be concerned about passing that on to their children as well.
Treacher Collins syndrome, or TCS, is a rare facial disfigurement that
greatly slows the development of bones and other tissues that make up the human face. As a result, most people living with TCS have underdeveloped cheek bones, a small jaw, and an undersized chin.

Format of the result file r1.txt:

a	8
about	2
an	1
and	4
are	1
around	1
as	2
backlash	1
be	4

Approach

1) The command-line argument sys.argv[1] is the source file and sys.argv[2] is the result file.
2) Split a1.txt into words, then use a dictionary to record how often each word occurs.
3) Words can be separated by many different characters, so first find out which non-letter characters occur in a1.txt; every non-letter character is a separator.
4) Use re.split() to do the splitting.

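Review: Splitting on Several Separators with a Regular Expression

import re
re.split(x,s)

splits s at the separators listed in the regular expression x
The separators in x are divided by "|", for example:
';| |,|\*|\n|\?|ok|8'
Some special characters, such as: ? ! " ' ( ) | * $ \ [ ] ^ { } . ,
are written with a \ in front of them when they appear in a regular expression (for ! " ' and , the backslash is optional but harmless)

import re
a = 'Beautiful, is; beoktter*than\nugly'
print(re.split(r';| |,|\*|\n|ok',a))  # separators are divided by |

# ';' ' ' ',' '*' '\n' 'ok' are all treated as separators
#>>['Beautiful', '', 'is', '', 'be', 'tter', 'than', 'ugly']
# two adjacent separators produce an empty string between them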
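
Writing the backslashes by hand is error-prone. As a sketch of an alternative, re.escape can add them automatically while the pattern is assembled from a list of separators. The list `separators` simply repeats the separators of the example above.

```python
import re

a = 'Beautiful, is; beoktter*than\nugly'

# Build the pattern from a list of separators; re.escape adds "\" where needed.
separators = [';', ' ', ',', '*', '\n', 'ok']
pattern = '|'.join(re.escape(s) for s in separators)
parts = re.split(pattern, a)

# Drop the empty strings that appear between adjacent separators.
words = [p for p in parts if p != '']

print(parts)   # ['Beautiful', '', 'is', '', 'be', 'tter', 'than', 'ugly']
print(words)   # ['Beautiful', 'is', 'be', 'tter', 'than', 'ugly']
```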

import sys
import re

def countFile(filename, words):  # analyse the word frequencies of filename and record them in the dictionary words
    try:
        f = open(filename, "r", encoding="gbk")  # change the encoding argument to match the file, e.g. encoding="utf-8"
    except Exception as e:
        print(e)
        return 0

    txt = f.read()  # read the whole file into the string txt
    f.close()
    splitChars = set()  # the set of separators
    # find every non-letter character in the text; each one is a separator
    for c in txt:
        if not (c >= 'a' and c <= 'z' or c >= 'A' and c <= 'Z'):
            splitChars.add(c)
    splitStr = ""  # the regular expression for re.split
    # it looks like ",|:| |-": each string between two "|" is a separator
    for c in splitChars:
        if c in ['.', '?', '!', '"', "'", '(', ')', '|', '*', '+', '$', '\\', '[', ']', '^', '{', '}']:
            # these characters are special in a regular expression, so put a "\" in front of them
            splitStr += "\\" + c + "|"  # inside a Python string, "\\" is a single \
        else:
            splitStr += c + "|"
    splitStr += " "  # something must follow the last '|'; listing the space twice does no harm
    lst = re.split(splitStr, txt)  # lst is the list of words after splitting
    for x in lst:
        if x == "":  # two adjacent separators produce an empty string; skip it
            continue
        lx = x.lower()
        if lx in words:
            words[lx] += 1  # already in the dictionary: add 1 to its count
        else:
            words[lx] = 1  # not yet in the dictionary: add it with a count of 1
    return 1

result = {}  # result dictionary, in the form {'a':2, 'about':3, ...}
if countFile(sys.argv[1], result) == 0:  # argv[1] is the source file; the result is recorded in result
    exit()
lst = list(result.items())
lst.sort()  # sort the words in dictionary order
f = open(sys.argv[2], "w")  # argv[2] is the result file, written in the default encoding; "w" means write
for x in lst:
    f.write("%s\t%d\n" % (x[0], x[1]))
f.close()
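
A more compact way to get the same kind of table, shown here only as a sketch and not as the course's method, is to extract the runs of letters directly with re.findall and count them with collections.Counter. The function name `count_words` and the sample sentence are invented for the example.

```python
import re
from collections import Counter

def count_words(text):
    """Count lowercase words, treating every run of non-letters as one separator."""
    return Counter(re.findall(r"[A-Za-z]+", text.lower()))

sample = "A rare case: a small jaw, and an undersized chin. A jaw?"
counts = count_words(sample)

# Same output format as the result file: word, tab, count, in dictionary order.
for word, n in sorted(counts.items()):
    print("%s\t%d" % (word, n))
```

Because the pattern matches the words rather than the separators, no separator list has to be built and no empty strings need to be skipped.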

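Program 2: Counting Cumulative Word Frequencies over Several Files

Program name: countfiles.py
Usage:
python countfiles.py result_file
For example:
python countfiles.py result.txt

The program counts word frequencies over every .txt file whose name starts with the letter a in the current folder (the folder that contains countfiles.py) and writes the combined result to the "result file" result.txt.

Approach

Find every file in the folder of the .py program whose name starts with a and ends with .txt. For each of them, call the function from Program 1 that processes a single file

import os  # the os library ships with Python
os.listdir()  # returns a list of all files and folders in the current folder; the elements are names without a path (directory)
os.path.isfile(x)  # tells whether x is a file (a folder is not a file)

os.listdir example

Suppose the folder c:\tmp contains the files t.py, a.txt, b.txt and the folder hello

The program t.py is:

import os
print(os.listdir())

Running t.py prints:
['a.txt', 'b.txt', 'hello', 't.py']

Implementation

import os

result = {}
lst = os.listdir()  # names of all files and folders in the current folder
for x in lst:
    if os.path.isfile(x):  # x is a file
        if x.lower().endswith(".txt") and x.lower().startswith("a"):  # x starts with 'a' and ends with .txt
            countFile(x, result)  # countFile is the single-file function from Program 1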

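Program 3: Counting Word Frequencies Accurately

Program name: countfile_novary.py
Usage:

python countfile_novary.py source_file result_file

The program analyses word frequencies in the "source file" and writes the result to the "result file". An inflected form of a word is converted to its base form before it is counted

The table of base forms and their inflected forms is in the file word_varys.txt, in this format: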

act
	acted|acting|acts
action
	actions
active
	actively|activeness

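Approach

1) As before, a dictionary is needed to count the words and their occurrences.

2) Read word_varys.txt and build a dictionary dt whose items look like {acted:act, acting:act, acts:act, actions:action, ...}. The key is an inflected form and the value is its base form.

3) For each word w in the "source file", look up the key w in dt. If it is not there, w is itself a base form and is counted directly. If it is there, the value dt[w] is the base form, and the count of dt[w] is increased by 1.

Implementation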

import sys
import re


def makeVaryWordsDict():
    vary_words = {}  # items have the form variant: base form, e.g. {acts:act, acting:act, boys:boy, ...}
    f = open("word_varys.txt", "r", encoding="gbk")
    lines = f.readlines()
    f.close()
    L = len(lines)
    for i in range(0, L, 2):  # every two lines hold one base form and its variants
        word = lines[i].strip()  # base form
        varys = lines[i + 1].strip().split("|")  # variants
        for w in varys:
            vary_words[w] = word  # add variant: base form; the base form of w is word
    return vary_words


def makeSplitStr(txt):
    splitChars = set([])
    # find every non-letter character in the text; each one is a separator
    for c in txt:
        if not (c >= 'a' and c <= 'z' or c >= 'A' and c <= 'Z'):
            splitChars.add(c)
    splitStr = ""
    # build the separator pattern for re.split
    for c in splitChars:
        if c in ['.', '?', '!', '"', "'", '(', ')', '|', '*', '+', '$', '\\', '[', ']', '^', '{', '}']:
            splitStr += "\\" + c + "|"
        else:
            splitStr += c + "|"
    splitStr += " "
    return splitStr


def countFile(filename, vary_word_dict):
    # analyse the file filename and return a dictionary as the result; base forms are looked up in vary_word_dict
    try:
        f = open(filename, "r", encoding="gbk")
    except Exception as e:
        print(e)
        return None
    txt = f.read()
    f.close()
    splitStr = makeSplitStr(txt)
    words = {}
    lst = re.split(splitStr, txt)
    for x in lst:
        lx = x.lower()
        if lx == "":
            continue
        if lx in vary_word_dict:  # if a base form is found in the dictionary, count the base form instead
            lx = vary_word_dict[lx]
        # the if statement above can be replaced by the single line  lx = vary_word_dict.get(lx, lx)
        words[lx] = words.get(lx, 0) + 1
    return words


result = countFile(sys.argv[1], makeVaryWordsDict())
if result is not None and result != {}:
    lst = list(result.items())
    lst.sort()
    f = open(sys.argv[2], "w", encoding="gbk")
    for x in lst:
        f.write("%s\t%d\n" % (x[0], x[1]))
    f.close()
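
The dictionary lookup at the heart of Program 3 can be tried without any files. In the sketch below the lines of word_varys.txt are given as a list, and the helper `make_vary_dict` is a hypothetical stand-in for makeVaryWordsDict that takes the lines as a parameter.

```python
def make_vary_dict(lines):
    """Map each variant to its base form; lines alternate base form / variants."""
    vary_words = {}
    for base_line, variants_line in zip(lines[0::2], lines[1::2]):
        base = base_line.strip()
        for variant in variants_line.strip().split("|"):
            vary_words[variant] = base
    return vary_words

# The first four lines of the format shown for word_varys.txt.
lines = ["act\n", "\tacted|acting|acts\n", "action\n", "\tactions\n"]
dt = make_vary_dict(lines)

words = ["Acting", "acts", "action", "actions", "bone"]
# dt.get(w, w) returns the base form if w is a known variant, otherwise w itself.
bases = [dt.get(w.lower(), w.lower()) for w in words]
print(bases)   # ['act', 'act', 'action', 'action', 'bone']
```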

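Program 4: `countfile_nocet4.py`

Usage

python countfile_nocet4.py source_file result_file

The program analyses word frequencies in the "source file", keeps only the words that are not in the CET-4 (College English Test Band 4) word list, and writes the result to the "result file"

The CET-4 word list is in the file cet4words.txt. Each word is on a line of its own and starts with $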

$abandon
[?'b?nd?n]
vt.遗弃；放弃；放纵(自己)
$ability
[?'b?l?t?]
n.能力，才能
$able
['e?bl]
a.有的能力；有本事的，能干的
$aboard
[?'b?:d]
ad.&prep.在船(飞机、车)上；ad.上船(飞机)

Approach

Read the words in cet4words.txt into a set. For each word met in the source file, first check whether it is in the set; if it is, discard it.

Code

import sys
import re


def makeFilterSet():
    cet4words = set([])
    f = open("cet4words.txt", "r", encoding="gbk")
    lines = f.readlines()
    f.close()
    for line in lines:
        line = line.strip()
        if line == "":
            continue
        if line[0] == "$":
            cet4words.add(line[1:])  # add the CET-4 word to the set
    return cet4words


def makeSplitStr(txt):
    splitChars = set([])
    # find every non-letter character in the text; each one is a separator
    for c in txt:
        if not (c >= 'a' and c <= 'z' or c >= 'A' and c <= 'Z'):
            splitChars.add(c)
    splitStr = ""
    # build the separator pattern for re.split
    for c in splitChars:
        if c in ['.', '?', '!', '"', "'", '(', ')', '|', '*', '+', '$', '\\', '[', ']', '^', '{', '}']:
            splitStr += "\\" + c + "|"
        else:
            splitStr += c + "|"
    splitStr += " "
    return splitStr


def countFile(filename, filterdict):  # count word frequencies, leaving out the words in the set filterdict
    words = {}
    try:
        f = open(filename, "r", encoding="gbk")
    except Exception as e:
        print(e)
        return None  # signal failure; the caller checks for this
    txt = f.read()
    f.close()
    splitStr = makeSplitStr(txt)
    lst = re.split(splitStr, txt)
    for x in lst:
        lx = x.lower()
        if lx == "" or lx in filterdict:  # skip the words that are in filterdict
            continue
        words[lx] = words.get(lx, 0) + 1
    return words


result = countFile(sys.argv[1], makeFilterSet())
if result:  # skip writing if the file could not be read or no words remain
    lst = list(result.items())
    lst.sort()
    f = open(sys.argv[2], "w", encoding="gbk")
    for x in lst:
        f.write("%s\t%d\n" % (x[0], x[1]))
    f.close()

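Databases and the SQL Language

The Concept of a Database

A database can store large amounts of data and provides convenient, fast ways to search it
This makes it easy to find the data that meets some condition quickly, for example: employees registered in Beijing who have worked for more than three years and earn more than 10000 yuan
A database can be a single file, for example c:/tmp/students.db

Tables in a Database

A database file can contain several tables. For example, students.db contains a "student information" table and a "course information" table
A table is made up of records. For example, each record in the "student information" table holds the information of one student.
A record is made up of fields, which describe several attributes of one thing. For example, a student record can consist of fields such as name, id, age, gender and gpa

Fields

Every field has a "type". For example

Field name	Data type	Meaning
name	text	string
gpa	real	floating-point number
age	integer	integer
profile	text	string
photo	blob	binary data (such as an image)
birthday	date	date (essentially text)
registertime	datetime	date and time (essentially text)

SQL Database Query Statements

Reference: http://www.w3school.com.cn/sql/sql_syntax.asp

SQL commands: standard statements for operating on a database

Command	Explanation
CREATE TABLE	create a table
INSERT INTO	insert a record into a table
UPDATE	update records in a table
SELECT	query a table
DELETE	delete records from a table

SQL commands are not case-sensitive

CREATE

CREATE TABLE if not exists students (id integer primary key, name text, gpa real, birthday date, age integer, picture blob)

This creates a table named students with the following fields:

Field name	Data type	Meaning
id	integer	primary key means the value cannot be repeated
name	text	string
gpa	real	floating-point number
birthday	date	date (essentially text)
age	integer	integer
picture	blob	binary data (such as an image)

INSERT

INSERT INTO students VALUES(1000, '张三', 3.81, '2000-09-12', 18, null)

This inserts a record into the table students; the record has no photo yet (null)

VALUES(the value of each field)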

Creating a Database

Creating a Database and Writing Data to It

import sqlite3

db = sqlite3.connect("c:/tmp/test2.db")  # connect to the database; it is created automatically if it does not exist
# the folder c:/tmp must already exist; connect does not create folders
cur = db.cursor()  # get a cursor; database operations normally go through a cursor

sql = '''CREATE TABLE if not exists students (id integer primary key ,
name text, gpa real, birthday date, age integer, picture blob) '''  # create the table students if it does not exist
cur.execute(sql)  # execute the SQL command
cur.execute("insert into students values (1600, '张三', 3.81, '2000-09-12', 18, null)")  # insert one record

mylist = [(1700, '李四', "3.25", '2001-12-01', 17, None), (1800, '王五', "3.35", '1999-01-01', 19, None)]
for s in mylist:  # insert each record of mylist in turn
    cur.execute('INSERT INTO students VALUES(?,?,?,?,?,?) ', (s[0], s[1], s[2], s[3], s[4], s[5]))  # each ? is filled from the matching item of the tuple
db.commit()  # actually write the changes; every write operation needs this
cur.close()  # close the cursor
db.close()  # close the database
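
The loop over mylist can also be written with executemany, and the effect of primary key can be observed directly. The sketch below uses an in-memory database (":memory:") so that it leaves no file behind; that choice, and the variable names `duplicate_rejected` and `row_count`, are assumptions of the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")   # in-memory database, nothing is written to disk
cur = db.cursor()
cur.execute("""CREATE TABLE if not exists students (id integer primary key,
    name text, gpa real, birthday date, age integer, picture blob)""")

mylist = [(1700, '李四', 3.25, '2001-12-01', 17, None),
          (1800, '王五', 3.35, '1999-01-01', 19, None)]
# executemany runs the statement once for every tuple in the list.
cur.executemany("INSERT INTO students VALUES(?,?,?,?,?,?)", mylist)
db.commit()

# A second record with id 1700 violates the primary key and is rejected.
try:
    cur.execute("INSERT INTO students VALUES(1700, 'dup', 0, '', 0, null)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

cur.execute("SELECT count(*) FROM students")
row_count = cur.fetchone()[0]
cur.close()
db.close()
print(row_count, duplicate_rejected)
```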

Querying and Modifying a Database

SELECT

SELECT * FROM students
    # retrieve all records of the students table
SELECT * FROM students ORDER BY age
    # retrieve all records of the students table, sorted by age
SELECT name, age FROM students
    # retrieve all records of the students table, but take only the name and age fields of each record
SELECT * FROM students WHERE name = '张三'
    # retrieve all records of the students table whose name field is 张三; WHERE states the search condition
SELECT * FROM students WHERE name= '张三' AND age > 20 ORDER BY age DESC
    # retrieve everyone in the students table named 张三 and older than 20, sorted by age in descending order

Retrieving Data from a Database

import sqlite3
db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
sql = 'select * from students'  # retrieve all records
cur.execute(sql)
x = cur.fetchone()  # fetchone returns the first record that meets the condition
print(x)  #=>(1600, '张三', 3.81, '2000-09-12', 18, None)
print(x[1])  #=>张三
for x in cur.fetchall():  # fetchall returns all remaining records that meet the condition
    print(x[:-2])  # the age and picture fields are not printed
cur.execute("SELECT * FROM students WHERE name= 'Jack'")
x = cur.fetchone()
if x is None:
    print("can't find Jack")
cur.close()
db.close()

Output:

(1600, '张三', 3.81, '2000-09-12', 18, None)
张三
(1700, '李四', 3.25, '2001-12-01')
(1800, '王五', 3.35, '1999-01-01')
can't find Jack

Retrieving Data from a Database

import sqlite3
db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
sql = 'select name, gpa, age from students where gpa > 3.3 order by age desc'  # find the records with gpa > 3.3, take three of their fields, sort by age in descending order

cur.execute(sql)
x = cur.fetchall()
if x != []:
    print("total: ", len(x))  #=>total:  2
    for r in x:
        print(r)
cur.close()
db.close()

Output:

total:  2
('王五', 3.35, 19)
('张三', 3.81, 18)
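
The search value does not have to be written into the SQL text: as with INSERT, a ? placeholder can be used in a SELECT. The sketch below rebuilds a reduced students table in memory (fewer columns than the course table, an assumption made to keep it short) and also shows that a cursor can be iterated directly.

```python
import sqlite3

db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE students (id integer primary key, name text, gpa real, age integer)")
cur.executemany("INSERT INTO students VALUES(?,?,?,?)",
                [(1600, '张三', 3.81, 18), (1700, '李四', 3.25, 17), (1800, '王五', 3.35, 19)])

min_gpa = 3.3
# The ? is filled in from the tuple, so the value never has to be pasted into the SQL string.
cur.execute("SELECT name, gpa, age FROM students WHERE gpa > ? ORDER BY age DESC", (min_gpa,))
rows = [r for r in cur]          # iterating a cursor yields one record at a time

cur.execute("SELECT * FROM students WHERE name = ?", ('Jack',))
jack = cur.fetchone()            # None when no record matches
cur.close()
db.close()
print(rows, jack)
```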

UPDATE

UPDATE students SET gpa = 3.9
    # set the gpa of every record to 3.9
UPDATE students SET gpa = 3.9, age = 18 WHERE name = '李四'
    # change the gpa and age of 李四

import sqlite3
db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
sql = 'UPDATE students SET gpa = ?, age = ? WHERE name = ?'
cur.execute(sql, (4.0, 20, '李四'))  # the three elements of the tuple correspond to the three ?
# change the gpa and age of 李四; if there is no 李四, nothing happens
db.commit()  # required for write operations
cur.close()
db.close()
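
Whether an UPDATE actually changed anything can be checked with the cursor's rowcount attribute. The sketch below uses a reduced in-memory table, which is an assumption made for brevity.

```python
import sqlite3

db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE students (name text, gpa real, age integer)")
cur.execute("INSERT INTO students VALUES('李四', 3.25, 17)")

sql = "UPDATE students SET gpa = ?, age = ? WHERE name = ?"
cur.execute(sql, (4.0, 20, '李四'))
changed = cur.rowcount           # number of records modified by the last statement
cur.execute(sql, (4.0, 20, 'Jack'))
unchanged = cur.rowcount         # 0: there is no record named Jack
db.commit()

cur.execute("SELECT gpa, age FROM students WHERE name = '李四'")
row = cur.fetchone()
cur.close()
db.close()
print(changed, unchanged, row)
```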

DELETE

DELETE FROM students WHERE age < 18
    # delete the records whose age is less than 18
DELETE FROM students
    # delete all records
    # do not forget to commit at the end

DROP TABLE

DROP TABLE IF EXISTS students
    # delete the table students

Do not forget to commit at the end

import sqlite3
db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
cur.execute("DROP TABLE IF EXISTS students")
db.commit()

try:
    cur.execute("select * from students")
    x = cur.fetchall()
    for r in x:
        print(r[:-1])
except sqlite3.OperationalError:  # raised because the table no longer exists
    print("no table")  #=> no table
cur.close()
db.close()

Listing All Tables in a Database and the Structure of a Table

import sqlite3
db = sqlite3.connect("c:/tmp/test3.db")
cur = db.cursor()
sql = 'CREATE TABLE if not exists table2(id integer, name text)'
cur.execute(sql)  # execute the SQL command
sql = 'CREATE TABLE if not exists table1(id integer, schook text)'
cur.execute(sql)
db.commit()

cur.execute("select name from sqlite_master where type='table' order by name")
x = cur.fetchall()
if x != []:
    print(x)

cur.execute("PRAGMA TABLE_INFO (table1)")
print(cur.fetchall())
cur.close()
db.close()

Output:

[('table1',), ('table2',)]
[(0, 'id', 'integer', 0, None, 0), (1, 'schook', 'text', 0, None, 0)]

For operations that modify a table, such as insert, delete and update, do not forget to commit before closing the database, otherwise the changes may be lost
Where necessary, use try ... except to handle the runtime errors raised when the database or a table does not exist

Binary Fields in a Database

Setting the value of a blob field (binary field):

import sqlite3
import requests  # third-party library for accessing network resources

f = open('c:/tmp/tmp.jpg', 'rb')  # open the image in binary mode
img = f.read()
f.close()

db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
sql = "UPDATE students SET picture=? WHERE name = '李四'"
cur.execute(sql, (img,))  # set the photo of 李四; img fills the ?


imgUrl = "https://img5.duitang.com/uploads/item/201605/19/20160519224441_VfMWRjpeg"  # get an image from the network
imgStream = requests.get(imgUrl, stream=True)
sql = "UPDATE students SET picture=? WHERE name = '张三' "
cur.execute(sql, (imgStream.content, ))  # set the photo of 张三; imgStream.content fills the ?
db.commit()
cur.close()
db.close()

Reading the value of a blob field (binary field)

import sqlite3
db = sqlite3.connect("c:/tmp/test2.db")
cur = db.cursor()
sql = "select name, picture from students WHERE name = '张三' or name = '李四'"
cur.execute(sql)
x = cur.fetchall()

for r in x:  # r[0] is the name, r[1] is the image file data
    f = open("c:/tmp/" + r[0] + ".jpg", "wb")  # the photos are written to 张三.jpg and 李四.jpg
    f.write(r[1])
    f.close()
cur.close()
db.close()
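
The round trip through a blob field can be checked without an image file or a network connection. In the sketch below, `bytes(range(256))` stands in for real image data, and the in-memory table with only two columns is an assumption of the example.

```python
import sqlite3

img = bytes(range(256))          # stand-in for the bytes of a real image file

db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE students (name text, picture blob)")
cur.execute("INSERT INTO students VALUES(?, ?)", ('李四', None))

# A bytes object passed for a ? is stored as a blob.
cur.execute("UPDATE students SET picture = ? WHERE name = ?", (img, '李四'))
db.commit()

cur.execute("SELECT picture FROM students WHERE name = ?", ('李四',))
stored = cur.fetchone()[0]       # a blob comes back as a bytes object
cur.close()
db.close()
print(type(stored), len(stored))
```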
