S2 IDE

An integrated development environment for advanced analytics and data

Tutorials

Here is a list of tutorials.

  1. A quick start guide
  2. S2 Tutorials
  3. Sample code

More coming…

R Alternative

You can do in S2 much of what you can do in R, and S2 does it much faster. Moreover, R code runs only inside the R environment; it is very difficult to deploy it anywhere else, such as in embedded devices like microwaves, automobiles and space rockets. S2 code runs in any JVM environment. There are now 15 billion devices that run a JVM!

S2 is an IDE for coding numerical algorithms. Let’s start with 1+1.

S2 1 plus 1

 

With 1+1 working, we can do pretty much anything in numerical programming, which is just a longer and more complicated series of operations like 1+1. Take integration, for example.

S2 integration
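To make the "many 1+1" point concrete, here is a minimal plain-Java sketch of numerical integration with the composite trapezoidal rule. It is not the S2 API; the class and method names (`TrapezoidSketch`, `integrate`) are hypothetical and chosen only for illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only: not S2's integration routine.
class TrapezoidSketch {

    // Composite trapezoidal rule on [a, b] with n equal sub-intervals.
    static double integrate(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        // End points carry half weight; interior points carry full weight.
        double sum = 0.5 * (f.applyAsDouble(a) + f.applyAsDouble(b));
        for (int i = 1; i < n; i++) {
            sum += f.applyAsDouble(a + i * h);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        // Integral of x^2 over [0, 1]; the exact value is 1/3.
        System.out.println(integrate(x -> x * x, 0.0, 1.0, 1000));
    }
}
```

The whole computation is a loop of additions and multiplications, which is the sense in which integration is "a series of many 1+1".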

S2 has the fastest linear algebra package in the Java world, probably.

S2 multiplication

S2 supports a few dozen types of graphs, charts and plots.

Box plot
S2 box plot

Density plot
S2 density plot

Scatter plot
S2 scatter plot

Histogram
S2 histogram

Bar chart
S2 bar chart

Surface plot

S2 surface plot

and many more…

S2 has a very comprehensive statistics package.

S2 supports almost all types of linear regression and their statistics, including OLS, GLM and logistic regression.

S2 ordinary least square regression
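For readers who want to see what an ordinary least squares fit computes, here is a minimal plain-Java sketch for the one-regressor case y = a + b·x. It is not the S2 regression API; `OlsSketch` and `fit` are hypothetical names, and a real package would also report standard errors, t-statistics and so on.

```java
// Illustrative sketch only: not S2's regression package.
class OlsSketch {

    // Returns {intercept, slope} that minimise the sum of squared residuals.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;                // closed-form OLS slope
        double intercept = meanY - slope * meanX; // line passes through the means
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        // Points that lie exactly on y = 1 + 2x.
        double[] beta = fit(new double[] {1, 2, 3, 4}, new double[] {3, 5, 7, 9});
        System.out.println("intercept=" + beta[0] + ", slope=" + beta[1]);
    }
}
```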

Time series analysis

S2 time series

Random number generation

S2 random number generation

Distributions

S2 F distribution

and many more…

Solvers are the foundation of future mathematics. They are at the core of AI and big data analytics. We need a solver for any problem that does not have a closed-form solution, which is pretty much any modern problem. S2 supports a full suite of standard optimization algorithms.

Linear programming

S2 linear programming

Quadratic programming

S2 quadratic programming

Second order conic programming

S2 second order conic programming

and many more…

It is simple to create and train a Neural Network (NN) in S2.

A simple script trains an NN to learn the Black-Scholes formula from a data set of stock prices and option prices.

S2 neural network

It converges in a few hundred epochs.

S2 learn black scholes formula

Python Replacement

Python does two things well: (1) scripting, as the glue that puts many components together for data analysis, and (2) array/tensor programming. S2 does both, but better.

The problems with Python are: (1) as an interpreted scripting language it is slow, and (2) it is very difficult to deploy code to other devices because of the many versions of its dependencies and an assortment of libraries written in FORTRAN, C, C++, etc. S2 is fast and runs on the 15 billion devices with a JVM by copying and pasting jars.

First, Python is slow, very slow. Second, Python scripts run only in the Python environment. You cannot port them to your phone, watch, router, automobile or rocket. Worst of all, Python deployment is well known to be a nightmare: a script runs fine on your machine, but it takes tremendous effort to make it run on another person’s machine.

S2 scripts are compiled to Java bytecode and run orders of magnitude faster than Python scripts. They run on any (embedded) device that runs a JVM, hence no deployment problem.

Java vs Python

S2 scripting acts as the glue that puts many components together, but with much better performance. The following case shows a scheduling system we built for a steel manufacturing plant. The steps are:

  1. Read the job data
  2. Read the machine data
  3. Schedule the jobs to the machines
  4. Plot the job-shop schedules to maximize utilization

All these steps are done in an S2 script of 12 lines! The same code can be deployed on S2, in a stand-alone application or in the cloud using REST.

S2 njsteel

The output schedule as a Gantt chart:

S2 njsteel Gantt chart

The power of Python comes from three libraries: scipy, numpy and pandas. Together, they let users put data in a high-dimensional array (aka tensor) so that they can dice, slice, cut and sample the tensor however they want. More importantly, they make it possible to run a script in parallel for high performance: split a pandas DataFrame into several chunks, spawn a thread to operate on each chunk and combine the results. (Yet most Python programmers don’t know how, or just don’t do this. This is one reason why a lot of Python scripts are slow.)

S2 supports exactly this kind of array/tensor programming for parallelization. Here is a parallelized version of the Black-Scholes formula example. The formula is applied concurrently to all the rows of the stock price/option price table, in the same split-apply-combine fashion described above for pandas.

  S2 asynchronous programming
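The idea of applying the formula concurrently to every row can be sketched in plain Java with a parallel stream. This is not the S2 script shown in the screenshot; the class and method names are hypothetical, the strike, rate, volatility and maturity are assumed to be the same for every row, and the normal CDF uses the Abramowitz and Stegun polynomial approximation rather than a library routine.

```java
import java.util.stream.IntStream;

// Illustrative sketch only: not the S2 script from the page.
class ParallelBlackScholesSketch {

    // Standard normal CDF, Abramowitz-Stegun formula 26.2.17 (approximation).
    static double normCdf(double x) {
        double t = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        double tail = Math.exp(-0.5 * x * x) / Math.sqrt(2.0 * Math.PI) * poly;
        return x >= 0 ? 1.0 - tail : tail;
    }

    // Black-Scholes price of a European call on a non-dividend-paying stock.
    static double callPrice(double s, double k, double r, double sigma, double t) {
        double d1 = (Math.log(s / k) + (r + 0.5 * sigma * sigma) * t)
                / (sigma * Math.sqrt(t));
        double d2 = d1 - sigma * Math.sqrt(t);
        return s * normCdf(d1) - k * Math.exp(-r * t) * normCdf(d2);
    }

    // Prices every row concurrently; the result keeps the input order.
    static double[] priceAll(double[] spots, double k, double r, double sigma, double t) {
        return IntStream.range(0, spots.length)
                .parallel()
                .mapToDouble(i -> callPrice(spots[i], k, r, sigma, t))
                .toArray();
    }

    public static void main(String[] args) {
        double[] spots = {90.0, 100.0, 110.0};
        for (double p : priceAll(spots, 100.0, 0.05, 0.2, 1.0)) {
            System.out.println(p);
        }
    }
}
```

Because each row is priced independently, the runtime is free to split the index range into chunks and work on them in separate threads, then combine the results.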

S2 also supports all kinds of dissecting, slicing, cutting, dicing, sampling and massaging of data frames using ND4J.

Big Data Handling

S2 can handle terabytes or even petabytes of data across arrays of machines in an effective manner using map-reduce programming in a simple S2 script.

Demo: a word count example of very large documents across machines.
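The demo itself is not reproduced here. As a rough single-JVM illustration of the map-reduce pattern behind a word count, the following plain-Java sketch maps each document to its words and then reduces by counting per word. It does not distribute work across machines and is not S2's map-reduce API; `WordCountSketch` and `count` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative sketch only: single-machine map-reduce, not S2's distributed API.
class WordCountSketch {

    static Map<String, Long> count(List<String> documents) {
        return documents.parallelStream()
                // Map phase: split each document into lower-case words.
                .flatMap(doc -> Arrays.stream(doc.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty())
                // Reduce phase: group identical words and count them.
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the quick brown fox", "the lazy dog", "The end");
        System.out.println(count(docs));
    }
}
```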

Industrial Partners

S2 partners with many third-party vendors to make their analytics, algorithms and data available on S2, making it a one-stop shop for algorithms and data.

AlgoQuant is a large library of financial analytics. It has hundreds of functions. It also comes with well cleaned and professionally maintained data for equities (US and China). AlgoQuant has many templates and frameworks for users to do research in portfolio management.

For example, suppose a user wants to study how a simple moving average crossover works for a particular stock. S/he needs only to write the strategy code in a few lines.

AlgoQuant simple moving average crossover
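For readers unfamiliar with the strategy, the crossover logic itself can be sketched in plain Java as follows. This is not AlgoQuant code; the class and method names are hypothetical, and the rule assumed here is the common one: be long when the fast average is above the slow average, short when it is below, and flat until enough prices are available.

```java
// Illustrative sketch only: not the AlgoQuant strategy API.
class SmaCrossoverSketch {

    // Simple moving average of the `window` prices ending at index `end` (inclusive).
    static double sma(double[] prices, int end, int window) {
        double sum = 0.0;
        for (int i = end - window + 1; i <= end; i++) {
            sum += prices[i];
        }
        return sum / window;
    }

    // Target position per day: +1 long, -1 short, 0 flat. Assumes fast < slow.
    static int[] positions(double[] prices, int fast, int slow) {
        int[] pos = new int[prices.length]; // defaults to 0 (flat)
        for (int t = slow - 1; t < prices.length; t++) {
            double fastAvg = sma(prices, t, fast);
            double slowAvg = sma(prices, t, slow);
            pos[t] = fastAvg > slowAvg ? 1 : (fastAvg < slowAvg ? -1 : 0);
        }
        return pos;
    }

    public static void main(String[] args) {
        double[] prices = {1, 2, 3, 4, 5, 4, 3, 2, 1};
        System.out.println(java.util.Arrays.toString(positions(prices, 2, 3)));
    }
}
```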

The script can be plugged into the AlgoQuant framework for backtesting.

AlgoQuant backtesting

AlgoQuant has a suite of analysis and reporting tools.

AlgoQuant reporting

SuperCurve is a fixed income data firm in China. They sell high quality bond data and analytics.

A user can retrieve China bond data in S2 using the SuperCurve API. S/he can then fit a zero-coupon yield curve to those bond data in only two lines of S2 code.

SuperCurve zero-coupon curve

With a yield curve, s/he can price any fixed income instrument on that date in S2.

SuperCurve bond risks
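To show what pricing off a zero-coupon curve involves, here is a minimal plain-Java sketch that discounts each cash flow at the zero rate for its payment time. It is not the SuperCurve API; the names are hypothetical, and continuous compounding is an assumption made for simplicity.

```java
// Illustrative sketch only: not the SuperCurve API.
class BondPricingSketch {

    // Present value of cash flows paid at `times` (in years), each discounted
    // at the matching continuously compounded zero rate (an assumed convention).
    static double price(double[] times, double[] cashFlows, double[] zeroRates) {
        double pv = 0.0;
        for (int i = 0; i < times.length; i++) {
            pv += cashFlows[i] * Math.exp(-zeroRates[i] * times[i]);
        }
        return pv;
    }

    public static void main(String[] args) {
        // A 2-year bond with face value 100 and a 5% annual coupon,
        // priced on a flat 5% zero curve (made-up numbers).
        double[] times = {1.0, 2.0};
        double[] cashFlows = {5.0, 105.0};
        double[] zeroRates = {0.05, 0.05};
        System.out.println(price(times, cashFlows, zeroRates));
    }
}
```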

Licensing

Community Edition


S2 Community Edition is free to use. Please share your feedback, bug reports and feature requests in our forum. You can try S2 without registering an account, but your work won't be saved. Registration is free!


Enterprise Edition

If you are looking to:

  1. increase S2's computational power
  2. have a private and secured server (cluster)
  3. co-develop and customize S2

please contact sales.

Third Party Vendors


If you would like to host your data and/or algorithms/analytics on S2, please contact us.

Collaboration


If you would like to work together or contribute to S2, please contact us.