
Frank_Fang — Sat, 22 Aug 2009 15:48:00 GMT

Python exceptions and regular expressions
http://docs.python.org/library/re.html
http://docs.python.org/howto/regex.html#regex-howto

Example 6.1. Opening a non-existent file
>>> fsock = open("/notthere", "r")
Traceback (innermost last):
File "<interactive input>", line 1, in ?
IOError: [Errno 2] No such file or directory: '/notthere'
>>> try:
...     fsock = open("/notthere")
... except IOError:
...     print "The file does not exist, exiting gracefully"
... print "This line will always print"
The file does not exist, exiting gracefully
This line will always print

# Bind the name getpass to the appropriate function
try:
      import termios, TERMIOS
except ImportError:
      try:
          import msvcrt
      except ImportError:
          try:
              from EasyDialogs import AskPassword
          except ImportError:
              getpass = default_getpass
          else:
              getpass = AskPassword
      else:
          getpass = win_getpass
else:
      getpass = unix_getpass

Example 6.10. Iterating through a dictionary
>>> import os
>>> for k, v in os.environ.items():
...     print "%s=%s" % (k, v)
USERPROFILE=C:\Documents and Settings\mpilgrim
OS=Windows_NT
COMPUTERNAME=MPILGRIM
USERNAME=mpilgrim

[...snip...]
>>> print "\n".join(["%s=%s" % (k, v)
... for k, v in os.environ.items()])
USERPROFILE=C:\Documents and Settings\mpilgrim
OS=Windows_NT
COMPUTERNAME=MPILGRIM

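Example 6.13. Using sys.modules
>>> import sys
>>> import fileinfo
>>> print '\n'.join(sys.modules.keys())
win32api
os.path
os
fileinfo
exceptions

>>> fileinfo
<module 'fileinfo' from 'fileinfo.pyc'>
>>> sys.modules["fileinfo"]
<module 'fileinfo' from 'fileinfo.pyc'>

The next example shows how to combine the __module__ class attribute with the sys.modules dictionary to get a reference to the module in which a class is defined.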

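Example 6.14. The __module__ class attribute
>>> from fileinfo import MP3FileInfo
>>> MP3FileInfo.__module__
'fileinfo'
>>> sys.modules[MP3FileInfo.__module__]
<module 'fileinfo' from 'fileinfo.pyc'>
Every Python class has a built-in class attribute __module__, which holds the name of the module in which the class is defined.
Combined with the sys.modules dictionary, it gives you a reference to the module in which a class is defined.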

Example 6.16. Constructing pathnames
>>> import os
>>> os.path.join("c:\\music\\ap\\", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.join("c:\\music\\ap", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.expanduser("~")
'c:\\Documents and Settings\\mpilgrim\\My Documents'
>>> os.path.join(os.path.expanduser("~"), "Python")
'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'

Example 7.2. Matching whole words
>>> import re
>>> s = '100 BROAD'
>>> re.sub('ROAD$', 'RD.', s)
'100 BRD.'
>>> re.sub('\\bROAD$', 'RD.', s)
'100 BROAD'
>>> re.sub(r'\bROAD$', 'RD.', s)
'100 BROAD'
>>> s = '100 BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD$', 'RD.', s)
'100 BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD\b', 'RD.', s)
'100 BROAD RD. APT. 3'

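What I really wanted was to match 'ROAD' only when it appears at the end of the string and is a whole word of its own, not part of some longer word. To express this in a regular expression, you use \b, which means "a word boundary must occur right here". In Python, this is complicated by the fact that the '\' character in a string must itself be escaped. This is sometimes called the backslash plague, and it is one reason why regular expressions are easier in Perl than in Python. On the other hand, Perl mixes regular expressions with other syntax, so if you have a bug, it can be hard to tell whether it is a bug in syntax or a bug in your regular expression.
To work around the backslash plague, you can use what is called a raw string, by prefixing the string with the letter r. This tells Python that nothing in the string should be escaped: '\t' is a tab character, but r'\t' is really the backslash character '\' followed by the letter 't'. I recommend always using raw strings when dealing with regular expressions; otherwise things get confusing quickly (and regular expressions get confusing quickly enough all by themselves).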
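To see the raw-string point in isolation, here is a minimal sketch. It assumes a current Python 3 interpreter (the notes above use Python 2 syntax), and the variable names are only illustrative.

```python
import re

s = "100 BROAD ROAD APT. 3"

# Without a raw string, every backslash meant for the regex must be doubled.
escaped = re.sub("\\bROAD\\b", "RD.", s)

# With a raw string, the pattern reads the way the regex engine sees it.
raw = re.sub(r"\bROAD\b", "RD.", s)

# '\t' is one character (a tab); r'\t' is two (a backslash and the letter t).
tab_length = len("\t")
raw_tab_length = len(r"\t")

print(escaped)
print(raw)
print(tab_length, raw_tab_length)
```

Both substitutions produce the same string; the raw version is simply easier to read.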

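Example 7.4. Checking for hundreds
>>> import re
>>> pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'
>>> re.search(pattern, 'MCM')
<_sre.SRE_Match object at 0x...>
>>> re.search(pattern, 'MD')
<_sre.SRE_Match object at 0x...>
>>> re.search(pattern, 'MMMCCC')
<_sre.SRE_Match object at 0x...>
>>> re.search(pattern, 'MCMC')
>>> re.search(pattern, '')
<_sre.SRE_Match object at 0x...>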
Example 7.5. The old way: every character optional
>>> import re
>>> pattern = '^M?M?M?$'
>>> re.search(pattern, 'M')
<_sre.SRE_Match object at 0x008EE090>
>>> pattern = '^M?M?M?$'
>>> re.search(pattern, 'MM')
<_sre.SRE_Match object at 0x008EEB48>
>>> pattern = '^M?M?M?$'
>>> re.search(pattern, 'MMM')
<_sre.SRE_Match object at 0x008EE090>
>>> re.search(pattern, 'MMMM')
>>>

Example 7.6. The new way: from n to m
>>> pattern = '^M{0,3}$'
>>> re.search(pattern, 'M')
<_sre.SRE_Match object at 0x008EEB48>
>>> re.search(pattern, 'MM')
<_sre.SRE_Match object at 0x008EE090>
>>> re.search(pattern, 'MMM')
<_sre.SRE_Match object at 0x008EEDA8>
>>> re.search(pattern, 'MMMM')
>>>
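As a quick check that the two spellings accept exactly the same strings, here is a small sketch (Python 3 assumed; the candidate list is my own choice):

```python
import re

old_pattern = re.compile(r"^M?M?M?$")    # every character optional
new_pattern = re.compile(r"^M{0,3}$")    # between 0 and 3 M characters

candidates = ["", "M", "MM", "MMM", "MMMM"]

# Record, for each candidate, whether each pattern matches it.
old_results = [bool(old_pattern.search(c)) for c in candidates]
new_results = [bool(new_pattern.search(c)) for c in candidates]

print(old_results)
print(new_results)
```

Both lists agree: everything up to three M characters matches, and four does not.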

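The expression for the ones place follows the same pattern. I'll skip the details and show the end result.

>>> pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'
So what does that look like using the alternative {n,m} syntax? This example shows the new syntax.

Example 7.8. Validating Roman numerals with {n,m}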
>>> pattern = '^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$'
>>> re.search(pattern, 'MDLV')
<_sre.SRE_Match object at 0x008EEB48>
>>> re.search(pattern, 'MMDCLXVI')
<_sre.SRE_Match object at 0x008EEB48>

Example 7.9. Regular expressions with inline comments
>>> pattern = """
    ^                   # beginning of string
    M{0,3}              # thousands - 0 to 3 M's
    (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
                        #            or 500-800 (D, followed by 0 to 3 C's)
    (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
                        #        or 50-80 (L, followed by 0 to 3 X's)
    (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
                        #        or 5-8 (V, followed by 0 to 3 I's)
    $                   # end of string
    """
>>> re.search(pattern, 'M', re.VERBOSE)
<_sre.SRE_Match object at 0x008EEB48>
>>> re.search(pattern, 'MCMLXXXIX', re.VERBOSE)
<_sre.SRE_Match object at 0x008EEB48>
>>> re.search(pattern, 'MMMDCCCLXXXVIII', re.VERBOSE)
<_sre.SRE_Match object at 0x008EEB48>
>>> re.search(pattern, 'M')
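The most important thing to remember when using verbose regular expressions is that you must pass an extra argument: re.VERBOSE, a constant defined in the re module that signals that the pattern should be treated as a verbose regular expression. As you can see, this pattern contains quite a bit of whitespace (all of which is ignored) and several comments (all of which are ignored). Once you ignore the whitespace and the comments, it is exactly the same regular expression as in the previous section, but much more readable.
>>> re.search(pattern, 'M')
This does not match. Why? Because it lacks the re.VERBOSE flag, so re.search treats the pattern as a compact regular expression, with significant whitespace and literal hash marks. Python cannot detect automatically whether a regular expression is verbose or compact. It assumes every regular expression is compact unless you explicitly state that it is verbose.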
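The effect of the flag can be reproduced with a short sketch. Python 3 is assumed, and `is_roman_numeral` is a hypothetical helper name of mine, not something from the book.

```python
import re

ROMAN_PATTERN = """
    ^                   # beginning of string
    M{0,3}              # thousands - 0 to 3 M's
    (CM|CD|D?C{0,3})    # hundreds
    (XC|XL|L?X{0,3})    # tens
    (IX|IV|V?I{0,3})    # ones
    $                   # end of string
    """


def is_roman_numeral(text):
    """Return True if text matches the verbose pattern (hypothetical helper)."""
    return re.search(ROMAN_PATTERN, text, re.VERBOSE) is not None


# With re.VERBOSE the whitespace and comments are ignored.
with_flag = is_roman_numeral("MCMLXXXIX")

# Without it, the newlines, spaces and '#' text are matched literally.
without_flag = re.search(ROMAN_PATTERN, "MCMLXXXIX") is not None

print(with_flag, without_flag)
```

Compiling the pattern once with `re.compile(ROMAN_PATTERN, re.VERBOSE)` would avoid having to remember the flag at every call site.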

Example 7.16. Parsing phone numbers (final version)
>>> phonePattern = re.compile(r'''
                # don't match beginning of string, number can start anywhere
    (\d{3})     # area code is 3 digits (e.g. '800')
    \D*         # optional separator is any number of non-digits
    (\d{3})     # trunk is 3 digits (e.g. '555')
    \D*         # optional separator
    (\d{4})     # rest of number is 4 digits (e.g. '1212')
    \D*         # optional separator
    (\d*)       # extension is optional and can be any number of digits
    $           # end of string
    ''', re.VERBOSE)
>>> phonePattern.search('work 1-(800) 555.1212 #1234').groups()
('800', '555', '1212', '1234')
>>> phonePattern.search('800-555-1212').groups()
('800', '555', '1212', '')

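You should now be familiar with the following techniques:

^ matches the beginning of a string.
$ matches the end of a string.
\b matches a word boundary.
\d matches any numeric digit.
\D matches any non-numeric character.
x? matches an optional x character (in other words, it matches an x zero or one times).
x* matches x zero or more times.
x+ matches x one or more times.
x{n,m} matches an x character at least n times, but not more than m times.
(a|b|c) matches either a or b or c.
(x) in general is a remembered group. You can get the value of what matched by calling the groups() method of the object returned by re.search.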
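Several of these pieces (\d, \D*, remembered groups, $) come together in the phone-number pattern. The sketch below wraps it in a function so that a failed match does not raise; `parse_phone` is a hypothetical name, and Python 3 is assumed.

```python
import re

phone_pattern = re.compile(r"""
    (\d{3})     # area code is 3 digits
    \D*         # optional separator
    (\d{3})     # trunk is 3 digits
    \D*         # optional separator
    (\d{4})     # rest of number is 4 digits
    \D*         # optional separator
    (\d*)       # extension is optional
    $           # end of string
    """, re.VERBOSE)


def parse_phone(text):
    """Return (area, trunk, rest, extension), or None if nothing matches."""
    match = phone_pattern.search(text)
    return match.groups() if match else None


print(parse_phone("work 1-(800) 555.1212 #1234"))
print(parse_phone("800-555-1212"))
print(parse_phone("no number here"))
```

Calling `.groups()` directly on the result of `search` raises an AttributeError when there is no match, which is why the helper checks for None first.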

http://www.woodpecker.org.cn/diveintopython/regular_expressions/phone_numbers.html

Regular expression pattern syntax
Element	Meaning
.	Matches any character except `\n` (if `DOTALL`, also matches `\n`)
^	Matches start of string (if `MULTILINE`, also matches after `\n`)
$	Matches end of string (if `MULTILINE`, also matches before `\n`)
*	Matches zero or more cases of the previous regular expression; greedy (match as many as possible)
+	Matches one or more cases of the previous regular expression; greedy (match as many as possible)
?	Matches zero or one case of the previous regular expression; greedy (match one if possible)
`*?` , `+?`, `??`	Non-greedy versions of `*`, `+`, and `?` (match as few as possible)
{`m`,`n`}	Matches `m` to `n` cases of the previous regular expression (greedy)
{`m`,`n`}?	Matches `m` to `n` cases of the previous regular expression (non-greedy)
[...]	Matches any one of a set of characters contained within the brackets
|	Matches expression either preceding it or following it
(...)	Matches the regular expression within the parentheses and also indicates a group
(?iLmsux)	Alternate way to set optional flags; no effect on match
(?:...)	Like `(...)`, but does not indicate a group
(?P<`id`>...)	Like `(...)`, but the group also gets the name `id`
(?P=`id`)	Matches whatever was previously matched by group named `id`
(?#...)	Content of parentheses is just a comment; no effect on match
(?=...)	Lookahead assertion; matches if regular expression `...` matches what comes next, but does not consume any part of the string
(?!...)	Negative lookahead assertion; matches if regular expression `...` does not match what comes next, and does not consume any part of the string
(?<=...)	Lookbehind assertion; matches if there is a match for regular expression `...` ending at the current position (`...` must match a fixed length)
(?<!...)	Negative lookbehind assertion; matches if there is no match for regular expression `...` ending at the current position (`...` must match a fixed length)
\`number`	Matches whatever was previously matched by group numbered `number` (groups are automatically numbered from 1 up to 99)
\A	Matches an empty string, but only at the start of the whole string
\b	Matches an empty string, but only at the start or end of a word (a maximal sequence of alphanumeric characters; see also `\w`)
\B	Matches an empty string, but not at the start or end of a word
\d	Matches one digit, like the set `[0-9]`
\D	Matches one non-digit, like the set `[^0-9]`
\s	Matches a whitespace character, like the set `[ \t\n\r\f\v]`
\S	Matches a non-whitespace character, like the set `[^ \t\n\r\f\v]`
\w	Matches one alphanumeric character; unless `LOCALE` or `UNICODE` is set, `\w` is like `[a-zA-Z0-9_]`
\W	Matches one non-alphanumeric character, the reverse of `\w`
\Z	Matches an empty string, but only at the end of the whole string
\\	Matches one backslash character

Frank_Fang 2009-08-22 23:48

Python Study Notes, Part 1

Frank_Fang — Fri, 21 Aug 2009 16:02:00 GMT

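I have noticed that many companies require a scripting language, and I have wanted to learn one for a while. I only studied C++ and Linux C programming to understand the system better. My career focus never seems to have been very clear, so I should concentrate on what I do best and come back to Linux C later when I have time. Finding a job comes first. My target is basically Java plus Python; most companies will not ask one person to do both Java and C++ anyway. Besides, languages are all much alike, and what really matters is the way of thinking about programming.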

d = {"server":"mpilgrim", "database":"master"}

li = ["a", "b", "mpilgrim", "z", "example"]

A tuple is an immutable list. Once a tuple is created, it cannot be changed in any way.
t = ("a", "b", "mpilgrim", "z", "example")

Joining lists and splitting strings
>>> li = ['server=mpilgrim', 'uid=sa', 'database=master', 'pwd=secret']
>>> s = ";".join(li)
>>> s
'server=mpilgrim;uid=sa;database=master;pwd=secret'
>>> s.split(";")
['server=mpilgrim', 'uid=sa', 'database=master', 'pwd=secret']
>>> s.split(";", 1)
['server=mpilgrim', 'uid=sa;database=master;pwd=secret']

Using type, str, dir, and other built-in functions

4.3.2. The str function
str coerces data into a string. Every datatype can be coerced into a string.

Example 4.6. Introducing str
>>> str(1)
'1'
>>> horsemen = ['war', 'pestilence', 'famine']
>>> horsemen
['war', 'pestilence', 'famine']
>>> horsemen.append('Powerbuilder')
>>> str(horsemen)
"['war', 'pestilence', 'famine', 'Powerbuilder']"
>>> str(odbchelper)
"<module 'odbchelper' from '...'>"
>>> str(None)
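'None'
For simple datatypes like integers, you would expect str to work, because almost every language has a function to convert an integer to a string.
However, str works on any object of any type. Here it works on a list that was built up piece by piece.
str also works on modules. Note that the string representation of the module includes the pathname of the module on disk, so your output will differ.
A subtle but important behavior of str is that it works on None, the Python null value. It returns the string 'None'. You will use this to your advantage in the info function, as you will see shortly.

dir returns a list of the attributes and methods of any object: modules, functions, strings, lists, dictionaries, and pretty much anything else.

Example 4.7. Introducing dir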
>>> li = []
>>> dir(li)
['append', 'count', 'extend', 'index', 'insert',
'pop', 'remove', 'reverse', 'sort']
>>> d = {}
>>> dir(d)
['clear', 'copy', 'get', 'has_key', 'items', 'keys', 'setdefault', 'update', 'values']
>>> import odbchelper
>>> dir(odbchelper)
['__builtins__', '__doc__', '__file__', '__name__', 'buildConnectionString']

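Finally, the callable function takes any object and returns True if the object can be called, or False otherwise. Callable objects include functions, class methods, and even classes themselves. (More on classes in the next chapter.)

Example 4.8. Introducing callable
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> string.join
<function join at 0x...>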
>>> callable(string.punctuation)
False
>>> callable(string.join)
True
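The same three built-ins can be tried on a current interpreter. This sketch assumes Python 3, where string.join no longer exists, so it uses a list method instead; the variable names are mine.

```python
words = ["war", "pestilence", "famine"]

as_text = str(words)                   # string form of a list
none_text = str(None)                  # str works on None too
has_append = "append" in dir(words)    # dir lists attribute and method names

method_is_callable = callable(words.append)       # bound methods are callable
string_is_callable = callable("just a string")    # plain strings are not
class_is_callable = callable(list)                # classes are callable

print(as_text, none_text, has_append)
print(method_is_callable, string_is_callable, class_is_callable)
```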

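You already know that Python functions are objects. What you don't know is that, by using the getattr function, you can get a reference to a function whose name is not known until run-time.

Example 4.10. Introducing getattr
>>> li = ["Larry", "Curly"]
>>> li.pop
<built-in method pop of list object at 0x...>
>>> getattr(li, "pop")
<built-in method pop of list object at 0x...>
>>> getattr(li, "append")("Moe")
>>> li
['Larry', 'Curly', 'Moe']
>>> getattr({}, "clear")
<built-in method clear of dict object at 0x...>
>>> getattr((), "pop")
Traceback (innermost last):
File "<interactive input>", line 1, in ?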
AttributeError: 'tuple' object has no attribute 'pop'

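4.4.1. getattr with modules
getattr isn't just for built-in datatypes. It also works on modules.

Example 4.11. The getattr function in apihelper.py
>>> import odbchelper
>>> odbchelper.buildConnectionString
<function buildConnectionString at 0x...>
>>> getattr(odbchelper, "buildConnectionString")
<function buildConnectionString at 0x...>
>>> object = odbchelper
>>> method = "buildConnectionString"
>>> getattr(object, method)
<function buildConnectionString at 0x...>
>>> type(getattr(object, method))
<type 'function'>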
>>> import types
>>> type(getattr(object, method)) == types.FunctionType
True
>>> callable(getattr(object, method))
True

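Using getattr, you get the same reference to the same function. In general, getattr(object, "attribute") is equivalent to object.attribute. If object is a module, then attribute can be anything defined in the module: a function, a class, or a global variable.

(This is roughly the equivalent of a function pointer.)
Example 4.12. Creating a dispatcher with getattr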

import statsout

def output(data, format="text"):
    output_function = getattr(statsout, "output_%s" % format)
    return output_function(data)
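The output function takes one required argument, data, and one optional argument, format. If format is not specified, it defaults to text, and the plain text output function is called.
You concatenate the format argument with "output_" to produce a function name, and then fetch that function from the statsout module. This makes it easy to extend the program later to support other output formats without changing the dispatch function. Just add another function to statsout, named for instance output_pdf, and pass "pdf" as the format to the output function.
Now you can simply call the output function like any other function. The output_function variable is a reference to the appropriate function in the statsout module.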

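Did you spot the bug in the previous example? The coupling between strings and functions is very loose, and there is no error checking. What happens if the user passes in a format for which statsout defines no output function? getattr raises an AttributeError, so the dispatcher fails with a message about a missing attribute rather than about an unsupported format. That's bad.

Thankfully, getattr takes an optional third argument, a default value, which it returns instead of raising when the attribute is not found.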
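A rough sketch of the dispatcher with that default in place follows. The statsout object here is a hypothetical stand-in built with types.SimpleNamespace, since the real module is not shown in these notes; the two output functions are invented for illustration, and Python 3 is assumed.

```python
from types import SimpleNamespace


def output_text(data):
    return "text: %s" % data


def output_html(data):
    return "<p>%s</p>" % data


# Hypothetical stand-in for the statsout module.
statsout = SimpleNamespace(output_text=output_text, output_html=output_html)


def output(data, format="text"):
    # The third argument is returned when the attribute does not exist,
    # so an unknown format falls back to plain text instead of raising.
    output_function = getattr(statsout, "output_%s" % format,
                              statsout.output_text)
    return output_function(data)


print(output("stats"))
print(output("stats", "html"))
print(output("stats", "pdf"))   # no output_pdf defined: falls back to text
```

Whether a silent fallback is better than an explicit error depends on the program; the point is only that the third argument lets you decide.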

[mapping-expression for element in source-list if filter-expression]

>>> li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"]
>>> [elem for elem in li if len(elem) > 1]
['mpilgrim', 'foo']
>>> [elem for elem in li if elem != "b"]
['a', 'mpilgrim', 'foo', 'c', 'd', 'd']
>>> [elem for elem in li if li.count(elem) == 1]
['a', 'mpilgrim', 'foo', 'c']

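When using and, values are evaluated in a boolean context from left to right. 0, '', [], (), {}, and None are false in a boolean context; everything else is true. Well, almost everything. By default, instances of classes are true in a boolean context, but you can define special methods in your class to make an instance evaluate to false.

4.6.1. Using the and-or trick
Example 4.17. Introducing the and-or trick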
>>> a = "first"
>>> b = "second"
>>> 1 and a or b
'first'
>>> 0 and a or b
'second'
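This syntax looks similar to the bool ? a : b expression in C. The entire expression is evaluated from left to right, so the and is evaluated first. 1 and 'first' evaluates to 'first', then 'first' or 'second' evaluates to 'first'.
0 and 'first' evaluates to 0, then 0 or 'second' evaluates to 'second'.

However, since this Python expression is simply boolean logic and not a special construct of the language, there is one very important difference between the and-or trick and the bool ? a : b syntax in C. If the value of a is false, the expression will not work as you would expect. (Can you tell I was bitten by this? More than once?)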
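The pitfall is easy to reproduce. The sketch below shows it together with two ways around it: wrapping the values in lists so the middle operand is never false, and the conditional expression that Python added in version 2.5. Variable names are illustrative.

```python
a = ""          # a false value
b = "second"

# The plain trick picks b even though the condition is true,
# because 1 and "" evaluates to "", which is false.
trick_result = 1 and a or b

# Wrapping both values in lists keeps the middle operand true.
safe_trick = (1 and [a] or [b])[0]

# The conditional expression does not have this problem.
conditional = a if 1 else b

print(repr(trick_result), repr(safe_trick), repr(conditional))
```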

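Python supports an interesting syntax that lets you define one-line mini-functions on the fly. Borrowed from Lisp, these so-called lambda functions can be used anywhere a function is required.
Example 4.20. Introducing lambda functions
>>> def f(x):
...     return x*2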
...
>>> f(3)
6
>>> g = lambda x: x*2
>>> g(3)
6
>>> (lambda x: x*2)(3)
6

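To generalize, a lambda function takes any number of arguments (including optional arguments) and returns the value of a single expression. lambda functions cannot contain commands, and they cannot contain more than one expression. Don't try to squeeze too much into a lambda function; if you need something more complex, define a normal function instead and make it as long as you want.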

Example 4.25. Printing a list
>>> li = ['a', 'b', 'c']
>>> print "\n".join(li)
a
b
c
This is also a useful debugging trick when you are working with lists, and in Python you work with lists all the time.

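Before version 2.2.1, Python had no separate boolean datatype. To compensate, Python accepted almost anything in a boolean context (such as an if statement), according to the following rules:
0 is false; all other numbers are true.
An empty string ("") is false; all other strings are true.
An empty list ([]) is false; all other lists are true.
An empty tuple (()) is false; all other tuples are true.
An empty dictionary ({}) is false; all other dictionaries are true.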
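These rules still hold in current Python, which can be checked with bool(). The values below are my own picks, one per rule.

```python
falsy_values = [0, "", [], (), {}, None]
truthy_values = [1, -1, "a", [0], (0,), {"k": "v"}]

# bool() applies the same test an if statement would.
falsy_results = [bool(v) for v in falsy_values]
truthy_results = [bool(v) for v in truthy_values]

print(falsy_results)
print(truthy_results)
```

Note that a list containing a single false element, such as [0], is still true: only emptiness counts.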

Here is the basic from module import syntax:

from UserDict import UserDict
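This is similar to the import module syntax you already know, with one important difference: UserDict is imported directly into the local namespace, so it can be used directly, without qualifying it with the module name. You can import individual items, or use from module import * to import everything.

from module import * in Python is like import module.* in Java; import module in Python is like import module in Java.

When should you use from module import?

If you will be accessing a module's attributes and methods often and don't want to type the module name over and over, use from module import.
If you want to selectively import some attributes and methods but not others, use from module import.
If the module contains attributes or functions with the same name as ones in your own module, you must use import module to avoid name conflicts.

Use from module import * sparingly, because it makes it hard to tell where a particular function or attribute came from, and that makes debugging and refactoring harder.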
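The difference between the two forms can be seen with a standard-library module. os.path is used here only because it is available everywhere; it is not the module from the notes.

```python
import os.path              # names stay qualified: os.path.join(...)
from os.path import join    # join lands directly in the local namespace

qualified = os.path.join("music", "ap")
unqualified = join("music", "ap")

# Both names refer to the very same function object.
same_function = join is os.path.join

print(qualified, unqualified, same_function)
```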

from UserDict import UserDict

class FileInfo(UserDict):
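In Python, the ancestor of a class is simply listed in parentheses immediately after the class name. There is no special keyword like extends in Java.

Python supports multiple inheritance. In the parentheses following the class name, you can list as many ancestor classes as you like, separated by commas.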

class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename
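Some pseudo-object-oriented languages like Powerbuilder have a concept of "extending" constructors and other events, where the ancestor's method is called automatically before the descendant's method runs. Python does not do this; you must always explicitly call the appropriate method in the ancestor class.
I told you that this class acts like a dictionary, and here is the first sign of it. You assign the argument filename as the value of this object's name key. Note that the __init__ method never returns a value.
(In Java, the superclass's default no-argument constructor is called automatically.)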
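What happens when the explicit call is left out can be sketched as follows. This assumes Python 3, where UserDict lives in the collections module; BrokenFileInfo is a hypothetical class added only to show the failure.

```python
from collections import UserDict  # Python 3 location; Python 2 used the UserDict module


class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)    # the parent initializer is not called automatically
        self["name"] = filename


class BrokenFileInfo(UserDict):
    "hypothetical subclass that forgets to call the parent initializer"
    def __init__(self, filename=None):
        self.filename = filename


info = FileInfo("/music/song.mp3")
broken = BrokenFileInfo("/music/song.mp3")

# UserDict.__init__ is what creates the underlying data dictionary.
info_has_data = hasattr(info, "data")
broken_has_data = hasattr(broken, "data")

print(info["name"], info_has_data, broken_has_data)
```

In the broken class the data attribute never gets created, so any later dictionary-style access fails.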

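Example 5.9. Defining the UserDict class

class UserDict:
    def __init__(self, dict=None):
        self.data = {}    # defines an instance variable of the class
        if dict is not None: self.update(dict)
Note that UserDict is a base class; it does not inherit from any other class.
This is the __init__ method that was overridden in the FileInfo class. Note that the argument list in this ancestor class differs from the one in the descendant. That's fine; each subclass can have its own set of arguments, as long as it calls the ancestor with the correct ones. Here the ancestor class has a way to define initial values (by passing a dictionary in the dict argument), which FileInfo does not use.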

×××××××× Differences from Java ××××××××
Unlike Java:
1. In Python, class attributes are defined directly in the class body.
2. Instance variables are defined inside the __init__ method, in the form self.instancevariable = ...
3. Instance methods in Python must declare the self parameter explicitly (the counterpart of Java's this), and a method must also go through self to refer to instance variables.

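Python supports data attributes (called "instance variables" in Java and "data members" in C++), which are data held by a specific instance of a class. In this case, each instance of UserDict has a data attribute named data. To reference this attribute from code outside the class, you qualify it with the instance name, instance.data, in the same way that you qualify a function with its module name. To reference a data attribute from within the class, you use self as the qualifier. By convention, all data attributes are initialized to reasonable values in the __init__ method. However, this is not required, since data attributes, like local variables, spring into existence when they are first assigned a value.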
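×××××××× Differences from Java ××××××××

Java supports overloading by argument list: a class can have several methods with the same name, as long as they differ in the number or the types of their arguments.
Python supports neither; there is no form of function overloading whatsoever. An __init__ method is an __init__ method, regardless of its arguments. A class can have only one __init__ method, and if a descendant class has an __init__ method, it always overrides the ancestor's __init__ method, even if the descendant defines it with a different argument list.

Why is method overloading not supported? Presumably because Python arguments can have default values.
××××××××
Always assign an initial value to all of an instance's data attributes in the __init__ method. It will save you hours of debugging later, tracking down AttributeError exceptions caused by referencing uninitialized (and therefore non-existent) attributes.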
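As a small illustration of that note, default argument values let a single __init__ cover the call shapes that Java would express as overloads. Both classes below are hypothetical examples of mine.

```python
class Greeter:
    "hypothetical class: one __init__ covers several call shapes"
    def __init__(self, name="world", punctuation="!"):
        # Initialize every data attribute here, as recommended above.
        self.name = name
        self.punctuation = punctuation

    def greet(self):
        return "Hello, %s%s" % (self.name, self.punctuation)


class Redefined:
    "defining the same method name twice does not overload it"
    def method(self):
        return "first"

    def method(self, extra=None):   # silently replaces the definition above
        return "second"


print(Greeter().greet())
print(Greeter("Frank").greet())
print(Greeter("Frank", "?").greet())
print(Redefined().method())
```

The second class shows why overloading cannot work: a later definition with the same name simply replaces the earlier one.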
Given class MP3FileInfo(FileInfo) and an instance mp3file:
mp3file.__class__ is fileinfo.MP3FileInfo      # True
mp3file.__class__ is fileinfo.FileInfo         # False
isinstance(mp3file, fileinfo.MP3FileInfo)      # True
isinstance(mp3file, fileinfo.FileInfo)         # True

li = [1, 2, 3]
li2 = [1, 2, 3]
li == li2      # True; like Java's equals()
li is li2      # False; like Java's ==

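5.7. Advanced special class methods
Python has more special methods than just __getitem__ and __setitem__. Some of them let you emulate functionality that you may not even know about.

The following example shows some of the other special methods in UserDict.

Example 5.16. More special methods in UserDict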
    def __repr__(self): return repr(self.data)
    def __cmp__(self, dict):
        if isinstance(dict, UserDict):
            return cmp(self.data, dict.data)
        else:
            return cmp(self.data, dict)
    def __len__(self): return len(self.data)
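    def __delitem__(self, key): del self.data[key]
__repr__ is a special method that is called when you call repr(instance). The repr function is a built-in function that returns a string representation of an object. It works on any object, not just class instances. You are already very familiar with repr, even if you don't know it. In the interactive window, when you type just a variable name and press ENTER, Python uses repr to display the variable's value. Create a dictionary d with some data and then print repr(d) to see for yourself.
__cmp__ is called when you compare class instances. In general, you can compare any two Python objects, not just class instances, by using ==. There are rules that define when built-in datatypes are considered equal; for instance, dictionaries are equal when they have all the same keys and values. For class instances, you can define the __cmp__ method and code the comparison logic yourself, and then use == to compare instances of your class; Python will call your __cmp__ special method for you.
__len__ is called when you call len(instance). The len function is a built-in function that returns the length of an object. It works on any object that could reasonably be thought of as having a length. The len of a string is its number of characters; the len of a dictionary is its number of keys; the len of a list or tuple is its number of elements. For class instances, define the __len__ method and code the length calculation yourself, then call len(instance) and Python will call your __len__ special method for you.
__delitem__ is called when you call del instance[key], which you may remember as the way to delete individual items from a dictionary. When you use del on a class instance, Python calls the __delitem__ special method for you.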
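The four methods can be tried on a small class of your own. The sketch below assumes Python 3, which no longer has __cmp__, so the comparison is written with __eq__ instead; Playlist is a hypothetical class, not part of the book's code.

```python
class Playlist:
    "hypothetical dictionary-like wrapper, in the spirit of UserDict"
    def __init__(self, data=None):
        self.data = dict(data or {})

    def __repr__(self):
        # Called by repr(instance) and by the interactive prompt.
        return repr(self.data)

    def __eq__(self, other):
        # Python 3 dropped __cmp__; == now calls __eq__.
        if isinstance(other, Playlist):
            return self.data == other.data
        return self.data == other

    def __len__(self):
        # Called by len(instance).
        return len(self.data)

    def __delitem__(self, key):
        # Called by del instance[key].
        del self.data[key]


p = Playlist({"title": "KAIRO", "genre": 31})
text = repr(p)
equal_to_dict = (p == {"title": "KAIRO", "genre": 31})
size_before = len(p)
del p["genre"]
size_after = len(p)

print(text, equal_to_dict, size_before, size_after)
```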

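In Java, you determine whether two string variables reference the same physical memory location by using str1 == str2. This is called object identity, and it is written in Python as str1 is str2. To compare string values in Java, you would use str1.equals(str2); in Python, you would use str1 == str2. Java programmers who have been taught that the world is a better place because == in Java compares by identity instead of by value may take some time to adjust to Python's lack of such "gotchas".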
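A short sketch of the difference follows. It uses lists rather than strings, because an interpreter may reuse one object for equal string literals, which would blur the demonstration.

```python
li = [1, 2, 3]
li2 = [1, 2, 3]
alias = li

equal_values = (li == li2)       # compares contents, like Java's equals()
same_object = (li is li2)        # compares identity, like Java's ==
alias_is_same = (alias is li)    # both names refer to one object

print(equal_values, same_object, alias_is_same)
```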

ord("a")   # 97
ord("A")   # 65

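5.8. Introducing class attributes
You already know about data attributes, which are variables owned by a specific instance of a class. Python also supports class attributes, which are variables owned by the class itself.

Example 5.17. Introducing class attributes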
class MP3FileInfo(FileInfo):
    "store ID3v1.0 MP3 tags"
    tagDataMap = {"title"   : ( 3, 33, stripnulls),
                  "artist" : ( 33, 63, stripnulls),
                  "album"   : ( 63, 93, stripnulls),
                  "year"    : ( 93, 97, stripnulls),
                  "comment" : ( 97, 126, stripnulls),
                  "genre"   : (127, 128, ord)}

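Example 5.18. Modifying class attributes (the counterpart of Java static variables)
>>> class counter:
...     count = 0
...     def __init__(self):
...         # Go through self.__class__ (or write counter.count += 1) to update
...         # the class attribute; self.count += 1 would create an instance
...         # attribute instead.
...         self.__class__.count += 1
...
The code above keeps a count of how many objects have been created.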
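The difference between the two spellings can be checked directly. In this sketch ShadowingCounter is a hypothetical class added to show what goes wrong with self.count += 1.

```python
class Counter:
    count = 0                        # class attribute, shared by all instances

    def __init__(self):
        self.__class__.count += 1    # updates the class attribute


class ShadowingCounter:
    count = 0

    def __init__(self):
        # Reads the class value, then stores the result on the instance,
        # so the class attribute itself never changes.
        self.count += 1


first = Counter()
second = Counter()
a = ShadowingCounter()
b = ShadowingCounter()

print(Counter.count)             # number of Counter objects created
print(ShadowingCounter.count)    # still the initial value
print(a.count, b.count)          # each instance has its own copy
```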

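5.9. Private functions
Like most languages, Python has the concept of private elements:

Private functions, which cannot be called from outside their module
Private class methods, which cannot be called from outside their class
Private attributes, which cannot be accessed from outside their class
Unlike in most languages, whether a Python function, method, or attribute is private or public is determined entirely by its name.

If the name of a Python function, class method, or attribute starts with (but does not end with) two underscores, it is private; everything else is public. Python has no concept of protected class methods (accessible only in their own class and descendant classes). Class methods are either private (accessible only in their own class) or public (accessible from anywhere).

In MP3FileInfo, there are two methods: __parse and __setitem__. As already discussed, __setitem__ is a special method; normally you would call it indirectly by using the dictionary syntax on a class instance, but it is public, and you could call it directly (even from outside the fileinfo module) if you had a really good reason. However, __parse is private, because it has two underscores at the beginning of its name.

In Python, all special methods (like __setitem__) and built-in attributes (like __doc__) follow a standard naming convention: they both start and end with two underscores. Don't name your own methods and attributes this way, because it will only confuse you (and others) later.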
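A minimal sketch of how the double-underscore rule behaves in practice is shown below. Parser is a hypothetical class, and the last line relies on Python's name mangling: the method is stored under the name _Parser__parse rather than being truly hidden.

```python
class Parser:
    "hypothetical class with one private and one public method"
    def __parse(self, text):
        return text.strip()

    def load(self, text):
        return self.__parse(text)    # fine: called from inside the class


p = Parser()
inside_result = p.load("  tag  ")

# From outside the class the private name cannot be found.
try:
    p.__parse("  tag  ")
    outside_call_failed = False
except AttributeError:
    outside_call_failed = True

# The name is mangled rather than hidden, so this still works.
mangled_result = p._Parser__parse("  tag  ")

print(inside_result, outside_call_failed, mangled_result)
```

So privacy in Python is a strong convention rather than an enforced barrier; the mangled name exists mainly to avoid accidental clashes in subclasses.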

#!/usr/bin/env python
#coding=utf-8
import os
import sys
from UserDict import UserDict

def stripnulls(data):
    "strip whitespace and null"
    return data.replace("\00","").strip()

class FileInfo(UserDict):
    "store file metadata"
    def __init__(self,filename=None):
        UserDict.__init__(self)
        # triggers __setitem__, which the MP3FileInfo subclass overrides
        self["name"]=filename

class MP3FileInfo(FileInfo):
    "store ID3v1.0 MP3 tags"
    tagDataMap={"title":(3,33,stripnulls),
                "artist":(33,63,stripnulls),
                "album" :(63,93,stripnulls),
                "year"  :(93,97,stripnulls),
                "comment":(97,126,stripnulls),
                "genre" :(127,128,ord)}

    def __parse(self,filename):
        "parse ID3v1.0 tags from Mp3 file"
        self.clear()
        try:
            fsock = open(filename,"rb",0)
            try:
                fsock.seek(-128,2)
                tagdata = fsock.read(128)
            finally:
                fsock.close()
            if tagdata[:3]=="TAG":
                for tag,(start,end,parseFunc) in self.tagDataMap.items():
                    self[tag] = parseFunc(tagdata[start:end])
        except IOError:
            pass

    def __setitem__(self,key,item):
        if key == "name" and item:
            self.__parse(item)
        FileInfo.__setitem__(self,key,item)

def listDirectory(directory,fileExtList):
    "get list of file info objects for files of particular extensions"
    fileList = [os.path.normcase(f)
                for f in os.listdir(directory)]
    fileList = [os.path.join(directory,f)
                for f in fileList if os.path.splitext(f)[1] in fileExtList]
    def getFileInfoClass(filename,module=sys.modules[FileInfo.__module__]):
        "get file info class from filename extension"
        subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:]
        return hasattr(module,subclass) and getattr(module,subclass) or FileInfo
    return [getFileInfoClass(f)(f) for f in fileList]

if __name__=="__main__":
    for info in listDirectory("G:\\test",[".mp3"]):
        print "\n".join(["%s=%s" % (k,v) for (k,v) in info.items()])
        print




Frank_Fang 2009-08-22 00:02
