» Most friendly to Excel users, Most Interactive, Data Analysis Programming

Excel-like Grid-style programming

Debugging SQL can be extremely cumbersome

SPLUser-friendly Debugging IDE

Debugging Python is also tedious

SELECT CODE, MAX(con_rise) AS longest_up_days

FROM (

SELECT CODE, COUNT(*) AS con_rise

FROM (

SELECT CODE, DT,

SUM(updown_flag) OVER (PARTITION BY CODE ORDER BY CODE, DT) AS no_up_days

FROM (

SELECT CODE, DT,

CASE WHEN CL > LAG(CL) OVER (PARTITION BY CODE ORDER BY CODE, DT) THEN 0

ELSE 1 END AS updown_flag

FROM stock

)

GROUP BY CODE, no_up_days

)

GROUP BY CODE

Extract SQL to debug intermediate steps

Employ print method to output intermediate results

High Interactivity for Exploratory Analysis

XLL Plugin Helps Excel

Write SPL code in Excel directly Finding periods during which stocks have risen consecutively for more than 5 days

=spl("=E(?1).sort(CODE,DT).group@i(CODE!=CODE[-1]||CL < CL[-1]).select(~.len()>=5).conj()",A1:D253)

Comprehensive and Simple Operations

Unique Set and Ordered Operations

calculate the longest consecutive rising days for each stock

SPL

	A
2	StockRecords.xlsx
3	=T(A1).sort(DT)
4	=A2.group(CODE;~.group@i(CL < CL[-1]).max(~.len()):max_increase_days)

Especially skilled at complex scenarios such as order-related operations, sliding windows, and cross-row computations,much simpler than SQL or Python

Python

import pandas as pd
stock_file = "StockRecords.txt"
stock_info = pd.read_csv(stock_file,sep="\t")
stock_info.sort_values(by=['CODE','DT'],inplace=True)
stock_group = stock_info.groupby(by='CODE')
stock_info['label'] = stock_info.groupby('CODE')['CL'].diff().fillna(0).le(0).astype(int).cumsum()
max_increase_days = {}
for code, group in stock_info.groupby('CODE'):
    max_increase_days[code] = group.groupby('label').size().max() – 1
max_rise_df = pd.DataFrame(list(max_increase_days.items()), columns=['CODE', 'max_increase_days'])

SQL

SELECT CODE, MAX(con_rise) AS longest_up_days
FROM (
    SELECT CODE, COUNT(*) AS con_rise
    FROM (
        SELECT CODE, DT,  SUM(updown_flag) OVER (PARTITION BY CODE ORDER BY CODE, DT) AS no_up_days
        FROM (
            SELECT CODE, DT, 
                    CASE WHEN CL > LAG(CL) OVER (PARTITION BY CODE ORDER BY CODE, DT)  THEN 0
                    ELSE 1 END AS updown_flag
            FROM stock
        )
    )
    GROUP BY CODE, no_up_days
)
GROUP BY CODE

What to use for data analysis programming: SPL,Python or SPL?

Unique Set and Ordered Operations

calculate the longest consecutive rising days for each stock

SPL

	A
2	StockRecords.xlsx
3	=T(A1).sort(DT)
4	=A2.group(CODE;~.group@i(CL < CL[-1]).max(~.len()):max_increase_days)

Especially skilled at complex scenarios such as order-related operations, sliding windows, and cross-row computations,much simpler than SQL or Python

SQL

SELECT CODE, MAX(con_rise) AS longest_up_days
FROM (
    SELECT CODE, COUNT(*) AS con_rise
    FROM (
        SELECT CODE, DT,  SUM(updown_flag) OVER (PARTITION BY CODE ORDER BY CODE, DT) AS no_up_days
        FROM (
            SELECT CODE, DT, 
                    CASE WHEN CL > LAG(CL) OVER (PARTITION BY CODE ORDER BY CODE, DT)  THEN 0
                    ELSE 1 END AS updown_flag
            FROM stock
        )
    )
    GROUP BY CODE, no_up_days
)
GROUP BY CODE

Python

import pandas as pd
stock_file = "StockRecords.txt"
stock_info = pd.read_csv(stock_file,sep="\t")
stock_info.sort_values(by=['CODE','DT'],inplace=True)
stock_group = stock_info.groupby(by='CODE')
stock_info['label'] = stock_info.groupby('CODE')['CL'].diff().fillna(0).le(0).astype(int).cumsum()
max_increase_days = {}
for code, group in stock_info.groupby('CODE'):
    max_increase_days[code] = group.groupby('label').size().max() – 1
max_rise_df = pd.DataFrame(list(max_increase_days.items()), columns=['CODE', 'max_increase_days'])

What to use for data analysis programming: SPL,Python or SPL?

Easy Big Data and Parallel Support

Big Data

In Memory

	A
1	StockRecords.txt
2	=file(A1).import@t().sort(CODE,DT)
3	=A2.group(CODE;~.group@i(CL < CL[-1]).max(~.len()):mi)

External Storage

	A
1	StockRecords.txt
2	=file(A1).cursor@t().sort(CODE,DT)
3	=A2.group(CODE;~.group@i(CL < CL[-1]).max(~.len()):mi)

Parallel Computation

Serial

	A
1	StockRecords.txt
2	=file(A1).cursor@t().sortx(CODE,DT)
3	=A2.group@i(CODE!=CODE[-1]\|\|CL< CL[-1])
4	=A3.select(~.len()>=5).conj()

Parallel

	A
1	StockRecords.txt
2	=file(A1).cursor@tm().sortx(CODE,DT)
3	=A2.group@i(CODE!=CODE[-1]\|\|CL< CL[-1])
4	=A3.select(~.len()>=5).conj()

Most Friendly to Excel UserHighly Concise & InteractiveEasy SQL Puzzles CrackerData Analysis Programming

Novel grid-style programming

Excel-like Grid-style programming

High Interactivity for Exploratory Analysis

XLL Plugin Helps Excel

Concise and Powerful Code

Comprehensive and Simple Operations

Unique Set and Ordered Operations

SPL

Python

SQL

Unique Set and Ordered Operations

SPL

SQL

Python

Easy Big Data and Parallel Support

Big Data

In Memory

External Storage

Parallel Computation

Serial

Parallel

Lightweight and Portable