
Sorting a Python list by two criteria

2016-06-18 17:45 1511 views




I have the following list created from a sorted csv:
list1 = sorted(csv1, key=operator.itemgetter(1))


I would actually like to sort the list by two criteria: first by the value in field 1 and then by the value in field 2. How do I do this?

python sorting
asked Mar 6 '11 at 19:36





half full



4 Answers


66 votes, accepted
Like this:
import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
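To see why this works, here is a minimal sketch on a few made-up rows (the data and the column layout are assumptions for illustration, not the asker's actual CSV). With more than one index, itemgetter returns a tuple, and Python compares tuples element by element, so the second field only matters when the first one ties.

```python
from operator import itemgetter

# Hypothetical rows standing in for csv1: (id, surname, firstname)
csv1 = [
    ("3", "Smith", "John"),
    ("1", "Jones", "Fred"),
    ("2", "Smith", "Anne"),
    ("4", "Jones", "Anne"),
]

# itemgetter(1, 2) builds the key (row[1], row[2]) for each row;
# tuples compare left to right, so field 2 breaks ties in field 1.
list1 = sorted(csv1, key=itemgetter(1, 2))

for row in list1:
    print(row)
```

Note that fields read from a CSV file are strings, so this sorts them as text.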


edited Sep 4 '15 at 8:54





BDM

answered Mar 6 '11 at 19:38





mouad

 
+1: More elegant than mine. I forgot that itemgetter can take multiple indices. – dappawit Mar 6 '11 at 19:43
@half full: glad it helped :) – mouad Mar 6 '11 at 19:53
operator is a module that needs to be imported. – trapicki Aug 28 '13 at 14:45
How would I proceed if I want to sort ascending on one element and descending on another, using itemgetter? – ashish Oct 12 '13 at 10:13
@ashish, see my answer below: with lambda functions this is clear. Sort by "-x[1]" or even "x[0]+x[1]" if you wish. – jaap Feb 27 '14 at 15:15


100 votes
Replying to this dead thread for the archive.

No need to import anything when using lambda functions.

The following sorts list by the first element in ascending order, then by the second element in descending order (the negation requires the second element to be numeric).
sorted(list, key=lambda x: (x[0], -x[1]))
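A quick sketch of what that key does, using made-up (name, score) pairs as an assumption. The minus sign flips the order of the second field only, and it works only because that field is a number.

```python
# Hypothetical rows: (name, score)
rows = [("b", 2), ("a", 1), ("a", 3), ("b", 5)]

# First element ascending, second element descending.
# Negating the value is only possible for numeric fields.
result = sorted(rows, key=lambda x: (x[0], -x[1]))
print(result)
```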


edited Jan 14 at 20:39

answered Jun 14 '13 at 13:01





jaap

 
 
I like this solution because you can convert strings to int for sorting like: lambda x: (x[0],int(x[1])). +1 – pbible Sep 10 '14 at 13:41
Nice. As you noted in comment to the main answer above, this is the best (only?) way to do multiple sorts with different sort orders. Perhaps highlight that. Also, your text does not indicate that you sorted descending on second element. – PeterVermont Jun 12 '15 at 14:25
Also what if x[1] is date? Should I convert it to integer also? @pbible does conversion of string to int preserve the alphabetical ordering of the string? – user1700890 Nov 20 '15 at 23:32
@user1700890 I was assuming the field was already string. It should sort strings in alphabetical order by default. You should post your own question separately on SO if it is not specifically related to the answer here or the OP's original question. – pbible Nov 23 '15 at 16:59
@jan it's reverse sort – jaap Feb 28 at 23:39
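The string-to-int conversion mentioned in the comments above can be sketched as follows. The CSV text is a made-up assumption; the point is that csv.reader yields every field as a string, and strings sort as text, so '10' comes before '9' unless the key converts it.

```python
import csv
import io

# Hypothetical CSV text; csv.reader returns every field as a string.
text = "x,9\nx,10\ny,2\n"
rows = list(csv.reader(io.StringIO(text)))

# Text comparison: '10' sorts before '9'.
as_strings = sorted(rows, key=lambda x: (x[0], x[1]))

# Numeric comparison on the second field after converting it in the key.
as_numbers = sorted(rows, key=lambda x: (x[0], int(x[1])))

print(as_strings)
print(as_numbers)
```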
8 votes
Python has a stable sort, so provided that performance isn't an issue the simplest way is to sort it by field 2 and then sort it again by field 1.

That will give you the result you want. The only catch is that if it is a big list (or you want to sort it often), calling sort twice might be an unacceptable overhead.
import operator
# Sort by the secondary key (field 2) first, then by the primary key (field 1).
list1 = sorted(csv1, key=operator.itemgetter(2))
list1 = sorted(list1, key=operator.itemgetter(1))


Doing it this way also makes it easy to handle the situation where you want some of the columns reverse sorted: just include the 'reverse=True' parameter when necessary.

Otherwise you can pass multiple parameters to itemgetter or manually build a tuple. That is probably going to be faster, but has the problem that it doesn't generalise well if some of the columns want to be reverse sorted (numeric columns can still be reversed by negating them but that stops the sort being stable).

So if you don't need any columns reverse sorted, go for multiple arguments to itemgetter. If you might, and the columns aren't numeric or you want to keep the sort stable, go for multiple consecutive sorts.
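The consecutive-sorts approach can be wrapped in a small helper. This is only a sketch: multisort and its specs argument are hypothetical names introduced here, not part of the answer or of the standard library, and the sample rows are made up.

```python
from operator import itemgetter

def multisort(rows, specs):
    """Return a new list sorted by several columns.

    specs is a list of (column_index, reverse) pairs, most significant first.
    (Hypothetical helper, for illustration only.)
    """
    result = list(rows)
    # Apply the least significant key first; because list.sort is stable,
    # each later pass keeps the earlier ordering among rows that tie.
    for index, reverse in reversed(specs):
        result.sort(key=itemgetter(index), reverse=reverse)
    return result

people = [
    ("Smith", "Anne", 30),
    ("Jones", "Fred", 30),
    ("Smith", "John", 60),
    ("Jones", "Anne", 30),
]

# Surname ascending, age descending, first name ascending.
ordered = multisort(people, [(0, False), (2, True), (1, False)])
for person in ordered:
    print(person)
```

Because each pass uses reverse=True rather than a negated key, this also works for columns that are not numeric.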

Edit: For the commenters who have problems understanding how this answers the original question, here is an example that shows exactly how the stable nature of the sorting ensures we can do separate sorts on each key and end up with data sorted on multiple criteria:
DATA = [
    ('Jones', 'Jane', 58),
    ('Smith', 'Anne', 30),
    ('Jones', 'Fred', 30),
    ('Smith', 'John', 60),
    ('Smith', 'Fred', 30),
    ('Jones', 'Anne', 30),
    ('Smith', 'Jane', 58),
    ('Smith', 'Twin2', 3),
    ('Jones', 'John', 60),
    ('Smith', 'Twin1', 3),
    ('Jones', 'Twin1', 3),
    ('Jones', 'Twin2', 3)
]

# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])

for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))


This is a runnable example, but to save people running it the output is:
Initial data in random order
Jones      Jane       58
Smith      Anne       30
Jones      Fred       30
Smith      John       60
Smith      Fred       30
Jones      Anne       30
Smith      Jane       58
Smith      Twin2      3
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Jones      Twin2      3

First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Jones      Jane       58
Smith      Jane       58
Smith      John       60
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith      John       60
Jones      John       60
Jones      Jane       58
Smith      Jane       58
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.

Jones      John       60
Jones      Jane       58
Jones      Anne       30
Jones      Fred       30
Jones      Twin1      3
Jones      Twin2      3
Smith      John       60
Smith      Jane       58
Smith      Anne       30
Smith      Fred       30
Smith      Twin1      3
Smith      Twin2      3


Note in particular how in the second step the reverse=True parameter keeps the firstnames in order, whereas simply sorting then reversing the list would lose the desired order for the third sort key.
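That difference is easy to check on a tiny made-up list (the rows below are an assumption, already in first-name order). With reverse=True, rows that tie on age keep their existing order; reversing an ascending sort flips the tied rows as well.

```python
# Hypothetical rows, already ordered by first name: (firstname, age)
rows = [("Anne", 30), ("Fred", 30), ("John", 60)]

# Descending by age; ties (Anne, Fred) keep their first-name order.
with_flag = sorted(rows, key=lambda r: r[1], reverse=True)

# Ascending sort followed by a reversal; the tied rows end up swapped.
flipped = sorted(rows, key=lambda r: r[1])[::-1]

print(with_flag)
print(flipped)
```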

edited Jan 22 '15 at 9:44

answered Mar 6 '11 at 19:49





Duncan

 
 
Thanks, that's very helpful. – half full Mar 6 '11 at 19:53
Stable sorting doesn't mean that it won't forget what your previous sorting was. This answer is wrong. – Mike Axiak Mar 6 '11 at 21:10
Stable sorting means that you can sort by columns a, b, c simply by sorting by column c then b then a. Unless you care to expand on your comment I think it is you that is mistaken. – Duncan Mar 6 '11 at 21:23
This answer is definitely correct, although for larger lists it's unideal: if the list was already partially sorted, then you'll lose most of the optimization of Python's sorting by shuffling the list around a lot more. @Mike, you're incorrect; I suggest actually testing answers before declaring them wrong. – Glenn Maynard Mar 6 '11 at 21:39
@MikeAxiak: docs.python.org/2/library/stdtypes.html#index-29 states in comment 9: Starting with Python 2.3, the sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal. This is helpful for sorting in multiple passes (for example, sort by department, then by salary grade). – trapicki Aug 28 '13 at 14:40
3 votes
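def keyfunc(x):
    # Use a tuple literal: tuple(x[1], x[2]) raises TypeError,
    # because tuple() accepts a single iterable argument.
    return (x[1], x[2])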

list1 = sorted(csv1, key=keyfunc)


answered Mar 6 '11 at 19:41





dappawit

 
I don't think tuple() can receive two arguments (or rather, three, if you count with self) – Filipe Correia Dec 12 '12 at 23:15
tuple can take only one argument – therealprashant Jun 6 '15 at 11:02