» esProc Desktop - esProc SPL Official Website

What is esProc Desktop?

Programming Language

that can be learned by non-professionals

Toolkits

for desktop data analysis and processing

What does esProc solve?

Complex calculations and transformations that are difficult to do directly in Excel
Multi-step interactive data analysis that BI software cannot perform
Repetitive processing batch files (xls/csv/...) on hand (querying/calculating/generating/converting)

How to do these now?

Excel

Simple operation, rich functions

Tedious batch and repetitive tasks

Insufficient support for complex operations

Turn to BI？ Tableau / PowerBI / Qlik / TIBCO / Objects BI …

Simple to use, smooth and beautiful interface

Single calculation, can only do rigid multidimensional analysis

Turn to programming？ VBA

Full programming capabilities, naturally built in Excel

Poor support for table calculations, large section of code for simple tasks

Turn to programming？ Python

Support table calculation, good interactivity, training courses everywhere

Table calculation does not conform to natural thinking
Too many types of tables need to be mastered
The difficulty is far beyond non-professionals

df = pd.read_csv("../login_data.csv")
df["ts"] = pd.to_datetime(df["ts"]).dt.date
grouped = df.groupby("userid")
aligned_dates = pd.date_range(start=df["ts"].min(), end=df["ts"].max(), freq='D')
user_date_wether_con3days = []
for uid, group in grouped:
    group = group.drop_duplicates('ts')
    aligned_group = group.set_index("ts").reindex(aligned_dates)
    consecutive_logins = aligned_group.rolling(window=7)
    n = 0
    date_wether_con3days = []
    for r in consecutive_logins:
        n += 1
        if n<7:
            continue
        else:
            ds = r['userid'].isna().cumsum()
            cont_login_times = r.groupby(ds).userid.count().max()
            wether_cont3days = 1 if cont_login_times>=3 else 0
            date_wether_con3days.append(wether_cont3days)
    user_date_wether_con3days.append(date_wether_con3days)
arr = np.array(user_date_wether_con3days)
day7_cont3num = np.sum(arr,axis=0)
result = pd.DataFrame({'dt':aligned_dates[6:],'cont3_num':day7_cont3num})

Turn to database？ SQL

True set of records, with interactivity

Complex environment, cannot directly process desktop files, too hard for non-professionals to use

The complexity of procedural calculation is too high for non-professionals to master

WITH all_dates AS (
	SELECT DISTINCT TRUNC(ts) AS login_date
	FROM login_data),
user_login_counts AS (
	SELECT userid, TRUNC(ts) AS login_date, 
		(CASE WHEN COUNT(*)>=1 THEN 1 ELSE 0 END) AS login_count
	FROM login_data
	GROUP BY userid, TRUNC(ts)),
whether_login AS (
	SELECT u.userid, ad.login_date, NVL(ulc.login_count, 0) AS login_count
	FROM all_dates ad
	CROSS JOIN (
		SELECT DISTINCT userid
		FROM login_data) u
	LEFT JOIN user_login_counts ulc
	ON u.userid = ulc.userid
	AND ad.login_date = ulc.login_date
	ORDER BY u.userid, ad.login_date),
whether_login_rn AS (
	SELECT userid,login_date,login_count,ROWNUM AS rn 
	FROM whether_login),
whether_eq AS(
	SELECT userid,login_date,login_count,rn,
		(CASE 
			WHEN LAG(login_count,1) OVER (ORDER BY rn)= login_count 
				AND login_count =1 AND LAG(userid,1) OVER (ORDER BY rn)=userid 
			THEN 0 
			ELSE 1 
		END) AS wether_e	
	FROM whether_login_rn
),
numbered_sequence AS (
	SELECT userid,login_date,login_count,rn, wether_e,
		SUM(wether_e) OVER (ORDER BY rn) AS lab
	FROM whether_eq),
consecutive_logins_num AS (
	SELECT userid,login_date,login_count,rn, wether_e,lab,
  		(SELECT (CASE WHEN max(COUNT(*))<3 THEN 0 ELSE 1 END)
   		FROM numbered_sequence b
   		WHERE b.rn BETWEEN a.rn - 6 AND a.rn
		AND b.userid=a.userid
   		GROUP BY b. lab) AS cnt
	FROM numbered_sequence a)
SELECT login_date,SUM(cnt) AS cont3_num
FROM consecutive_logins_num
WHERE login_date>=(SELECT MIN(login_date) FROM all_dates)+6
GROUP BY login_date
ORDER BY login_date;

esProc Desktop!

Plug & Play, Non-professionals can play it independently

Structured data oriented

Structured Process Language

Conside code, far better than VBA, SQL, and Python

	A
1	=file("login_data.csv").import@tc()
2	=periods(date(A1.ts),date(A1.m(-1).ts))
3	=A1.group(userid).(~.align(A2,date(ts)).(if(#<7,null,(cnt=~[-6:0].group@i(!~).max(count(~)),if(cnt>=3,1,0)))))
4	=msum(A3).~.new(A2(#):dt,int(~):cont3_num).to(7,)

Native support for structured data
True set records, only one type of set
Compute desktop files directly
Complete programming capability
Simplify complex calculations and lower the technical threshold of data analysis

Excellent Table Processing

Merge, aggregate and split

Merge table with same columns
Merge and delete duplicated columns
Merge and aggregate duplicated columns
Append and accumulate data to summary
Split by categories or rows
……

Set operations and belongingness judgment

The intersection, union, and difference of simple members
The intersection, union, and difference on uncertain number of sets
Judge equality and belongingness between sets
……

Special grouping and aggregation

Group every N members
Group by adjacent data
Group by empty or non-empty rows
Group by data interval
Concatenate same group of data into text
……

Expand and complement

Generate continuous intervals
Convert one row to multiple rows
Split text into multiple rows
Fill in missing parts in continuous values
Fill up a number of blank row every N rows
……

Searching and filtering

Find position of a certain value
Filter by location
Find the first and last data
Find the maximum and minimum Values
Select the top N and last N
……

Duplicates judgment, counting and deleting

Find duplicate values
Count the number of duplicates
Remove paired data
Deduplicate row-based data
Deduplicate without changing original order
……

Association and comparison

Association through a table
Interval association
Use tow-dimensional association table
Use interval range to look up association table
Associate multiple rows
……

Operations on text and datetime

Split a text into multiple
Extract number and date
Split into words
Calculate overlapped time interval
Generate consecutive Sundays
……

Calculate cell value and aggregation value

Calculate by adjacent rows and intervals
Cumulate with possible early termination
Calculate cumulative value in categories
Add aggregation of same category
Split summary into detail rows
……

Sorting and Ranking

Align data in specified order
Align data with duplicates in specified order
Concatenate members of same ranks
Sorting in category
Ranking in category
……

Conversion between rows and columns

Rows to fixed columns
Convert row-style table and cross table
Convert row-column upper classification
Put data in category into columns
Re-categorize data when put into columns
……

File extraction and generation

Extract data from uncertain rows
Extract data from specified cells
Extract data from mixed-style structure sheet
Fill in specified cells
Generate card-style sheets
……

SPL IDE

Full-featured data computing IDE

Coding in grid

SPL XLL

Plug-in to enhance Excel computing capability

SPL WIN

Desktop data interactive computing tool

Case

Find out the star products that are in the top10 every month

SPL XLL: Calculate in Excel directly

Transpose the table from column to row

SPL XLL: Process in Excel directly

Monthly sales of each model, then LRR and YOY

SPL WIN：Multi-step interactive analysis

1. Group by model, month, year and count

2. Calculate YOY: if(Model==Model[-1] && Month==Month[-1],SalesVol/SalesVol[-1],null)

3. Resort by model, year, month

4. Calculate LRR: if(Model==Model[-1] ,SalesVol/SalesVol[-1],null)

Associate spreadsheets and calculate actual wages

Wage = StandardWages*(1+Evaluation-Absence/40)+Bonus

SPL WIN：Multi-step interactive calculation

1. Load 3 spreadsheets

2. Associate standard with absent and performance

3. Calculate wage according to the formula

Complement missing data in Excel: use mode for integer columns and average for floating-point columns

SPL IDE： Code implements loop and judgment logic

	A	B	C
1	=file("data.xlsx").xlsimport@t()
2	for A1.fname()	=A1.field(A2)
3		=B2.sum()	=B2.sum(int(~))
4		=if(B3==C3,B2.mode(),B2.avg())
5		=B2.(if(~,~,B4))	>A1.field(A2,B5)
6	>file("dataNew.xlsx").xlsexport@t(A1)

Extract row-data from a batch of card-type XLS files

SPL IDE：Concise code parses XLS files with complex format

	A	B	C
1	[ID,Name,Sex,Postion,Birthday,Phone,Address,PostCode]
2	[C1,C2,F2,C3,C4,D5,C7,C8]
3	=directory@p("data/*.xlsx")
4	for A3	=file(A4).xlsopen()	=B2.(B4.xlscell(~))
5		=@\|C4
6	=create(${A1.concat@c()}).record(B5)
7	>file("all.xlsx").xlsexport@t(A6)

Resource

Download

SPL Programming

Desktop and Excel Data Processing Cases

Learning SPL Win

A must-have tool for desktop data processing and analysis esProc Desktop