Filter according to cumulative value

Task:Find the top n customers whose sales accounts for half of the total sales, and rank them by sales in descending order.

Python

1	import pandas as pd
2	sale_file = "E:\\txt\\sales_volume.csv"
3	sale_info = pd.read_csv(sale_file)
4	sale_info.sort_values(by='Amount',inplace=True,ascending=False)
5	half_amount = sale_info['Amount'].sum()/2
6	vip_list = []
7	amount = 0
8	for client_info in sale_info.itertuples():
9	amount += getattr(client_info, 'Amount')
10	if amount < half_amount:
11	vip_list.append(getattr(client_info, 'Client'))
12	else:
13	vip_list.append(getattr(client_info, 'Client'))
14	break
15	print(vip_list)

Pandas does not have an existing loop function to calculate the position that meets the conditions, so it can only use for loop to complete the calculation.

esProc

	A
1	E:\\txt\\sales_volume.csv	Data storage path
2	=file(A1).import@tc().sort@z(Amount)	Import data and sort by amount in descending order
3	=A2.sum(Amount)/2	Calculate half of total sales
4	=A2.pselect(cum(Amount)>=A3)	Find the position where the cumulative sales is more than half of total sales
5	=A2.(Client).m(:A4)	Take the client names before the position

It is recommended to use loop function to calculate in esProc, and filter and record the position information according to the conditions.

Filter according to cumulative value

Python

esProc

SPL vs Python