6 Major Advantage, esProc SPL helps data analysts.
1More Concise Code
e.g.Longest consecutive rising days of each stock
SPL is simpler, eliminating any loop statements.
SQL
SELECT CODE, MAX(con_rise) AS longest_up_days
FROM (
SELECT CODE, COUNT(*) AS con_rise
FROM (
SELECT CODE, DT,
SUM(updown_flag) OVER (PARTITION BY CODE ORDER BY CODE, DT) AS no_up_days
FROM (
SELECT CODE, DT,
CASE WHEN CL > LAG(CL) OVER (PARTITION BY CODE ORDER BY CODE, DT) THEN 0
ELSE 1 END AS updown_flag
FROM stock
)
)
GROUP BY CODE, no_up_days
)
GROUP BY CODE
SPL
| A |
1 | =stock.sort(StockRecords.txt) |
2 | =T(A1).sort(DT) |
3 | =A2.group(CODE;~.group@i(CL< CL[-1]).max(~.len()):max_increase_days) |
Python
import pandas as pd
stock_file = "StockRecords.txt"
stock_info = pd.read_csv(stock_file,sep="\t")
stock_info.sort_values(by=['CODE','DT'],inplace=True)
stock_group = stock_info.groupby(by='CODE')
stock_info['label'] = stock_info.groupby('CODE')['CL'].diff().fillna(0).le(0).astype(int).cumsum()
max_increase_days = {}
for code, group in stock_info.groupby('CODE'):
max_increase_days[code] = group.groupby('label').size().max() – 1
max_rise_df = pd.DataFrame(list(max_increase_days.items()), columns=['CODE', 'max_increase_days'])
2Strong Interactivity Suitable for Multi-Step Exploratory Analysis and Convenient Debugging
e.g.Find the periods of stock price increases lasting for more than 5 consecutive days.
3Built-in parallel out-of-memory calculations.
In-memory
| A |
1 | smallData.txt |
2 | =file(A1).import@t() |
3 | =A2.groups(state;sum(amount):amount) |
Out-of-memory
| A |
1 | bigData.txt |
2 | =file(A1).cursor@t() |
3 | =A2.groups(state;sum(amount):amount) |
Serial
| A |
1 | bigData.txt |
2 | =file(A1).cursor@t() |
3 | =A2.groups(state;sum(amount):amount) |
Parallel
| A |
1 | bigData.txt |
2 | =file(A1).cursor@tm() |
3 | =A2.groups(state;sum(amount):amount) |
4Lightweight portability
Runs smoothly on a regular laptop without the need for a server cluster.
Stores data in highly compressed files, making it easy to carry.
5Openness, directly calculate diverse data sources
6Pure Java Enables direct Integration of Explored Results into Enterprise Applications