Understanding map, reduce, and filter in Python

2014-03-05

map, reduce, and filter are three interesting built-in functions in Python (this post uses Python 2).

1. First, what is an iterable object?


Take the built-in max function as an example and look at its docstring:

>>> print max.__doc__
max(iterable[, key=func]) -> value
max(a, b, c, ...[, key=func]) -> value

With a single iterable argument, return its largest item.
With two or more arguments, return the largest argument.

In the first form of max, the first argument is an iterable object. So which objects are iterable? Strings, tuples, and lists all are:

>>> max('abcx')
'x'
>>> max('1234')
'4'
>>> max((1,2,3))
3
>>> max([1,2,4])
4
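
One way to check whether an object is iterable is to pass it to the built-in iter(), which raises TypeError for objects that cannot be iterated. The sketch below assumes a helper named is_iterable, which is made up for illustration and is not part of the original post. It is written to run on both Python 2 and Python 3:

```python
def is_iterable(obj):
    """Return True if iter() accepts obj (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

# Strings, tuples and lists are iterable; a plain integer is not.
checks = [is_iterable('abcx'), is_iterable((1, 2, 3)),
          is_iterable([1, 2, 4]), is_iterable(42)]
print(checks)  # [True, True, True, False]
```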

Following https://www.ibm.com/developerworks/cn/opensource/os-cn-python-yield/, we can use yield to produce an iterable object (there are other ways too):

def my_range(start, end):
    '''Yield the integers from start to end, inclusive.'''
    while start <= end:
        yield start
        start += 1

Run the following code:

for num in my_range(1, 4):
    print num
print max(my_range(1, 4))

The output is:

1
2
3
4
4
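
As one of the "other ways" of producing an iterable, a class can define an __iter__ method. The following is only a sketch: the class name Countdown is invented for illustration, and the code is written to run on both Python 2 and Python 3:

```python
class Countdown(object):
    """Hypothetical iterable that counts from start down to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return a fresh iterator on every call, so the object
        # can be iterated more than once.
        return iter(range(self.start, 0, -1))

c = Countdown(3)
print(list(c))  # [3, 2, 1]
print(max(c))   # 3
```

Unlike a generator, which is exhausted after one pass, this object can be handed to list() and then to max() because each call to __iter__ starts over.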

2. map


The documentation at http://docs.python.org/2/library/functions.html#map describes map as follows:

map(function, iterable, ...)
Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items. If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation). The iterable arguments may be a sequence or any iterable object; the result is always a list.

map applies a user-supplied function to every element of the iterable and returns all the results as a list. For example:

def func(x):
    '''Return the square of x.'''
    return x*x

print map(func, [1,2,4,8])
print map(func, my_range(1, 4))

The result is:

[1, 4, 16, 64]
[1, 4, 9, 16]

The same result can be obtained with a list comprehension:

print [x*x for x in [1,2,4,8]]
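
The documentation quoted above also mentions passing several iterables, in which case the function receives one item from each in parallel. A minimal sketch, assuming two lists of equal length and a helper named add chosen for illustration. It wraps the call in list() so that it behaves the same on Python 2 and on Python 3, where map returns a lazy object instead of a list:

```python
def add(x, y):
    """Return the sum of one item taken from each iterable."""
    return x + y

# list() is harmless on Python 2 (map already returns a list) and
# materialises the lazy map object on Python 3.
sums = list(map(add, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]
```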

3. reduce


The documentation at http://docs.python.org/2/library/functions.html#reduce describes reduce as follows:

reduce(function, iterable[, initializer])
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.

That description is already quite clear:

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])

is equivalent to computing

((((1+2)+3)+4)+5)

while:

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5],6)

is equivalent to computing

(((((6+1)+2)+3)+4)+5)
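
The two calls above, plus the empty-iterable case from the documentation, can be checked with the short sketch below. It imports reduce from functools, which works on Python 2.6+ and is required on Python 3, where reduce is no longer a built-in. The variable names are chosen for illustration:

```python
from functools import reduce  # available on Python 2.6+ and Python 3
import operator

numbers = [1, 2, 3, 4, 5]

total = reduce(operator.add, numbers)               # ((((1+2)+3)+4)+5)
total_with_init = reduce(operator.add, numbers, 6)  # (((((6+1)+2)+3)+4)+5)
empty_default = reduce(operator.add, [], 6)         # initializer is returned

print([total, total_with_init, empty_default])  # [15, 21, 6]
```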

4. filter


The documentation at http://docs.python.org/2/library/functions.html#filter describes filter as follows:

filter(function, iterable)
Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.

Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.

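The function argument is called on each element of the iterable; if it returns true for an element, that element is kept in the result (a list, or a string or tuple when the input is one). For example, to filter the character a out of a string: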

def func(x):
    '''Return True unless x is the character "a".'''
    return x != 'a'

print filter(func, 'awake')

The result is:

wke

This can also be done with a list comprehension:

print ''.join([x for x in 'awake' if x != 'a'])
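
The documentation quoted above also covers the case where function is None, in which all false elements are dropped. A small sketch with made-up sample data. The list() wrapper is an addition so that the code gives the same list on Python 2 and on Python 3, where filter returns a lazy object:

```python
values = [0, 1, '', 'a', None, [], [2]]

# With None as the function, only truthy items are kept.
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [2]]

# The equivalent list comprehension given in the documentation.
same = [item for item in values if item]
print(same == truthy)  # True
```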
(End)