Hello everyone!

I am working on a Java-based project in which I have to decide whether two series are cointegrated. The problem I ran into is that there is not much discussion or community involvement around the nm-dev library.

When I used the Engle-Granger cointegration test, I just calculated the P-value, T-value, and C-value. Then a single condition is enough to be sure the series are cointegrated: **(P-value < 0.05 && T-value < C-value)**.

This is very easy to do with the Python library "statsmodels": just import the "coint" function, which returns all three values above.

So I need your help to better understand how the Johansen test works and how I can apply it to my series to reach a definite conclusion about cointegration.

So, I have initialized an instance of the class CointegrationMLE:

    CointegrationMLE cointegrationMLE = new CointegrationMLE(timeSeries, false);

And I get numerous parameters, such as:

    n: 198
    rank: 2
    eigen values: [0,070806, 0,004953]
    alpha: 2x2
         [,1]       [,2]
    [1,] -0,087240, 0,001106,
    [2,] 0,175470,  0,002975,
    1-th cointegrating factor: [1,000000, -0,383241]
    2-th cointegrating factor: [1,000000, -0,051580]

Are these parameters sufficient to make an assumption about series cointegration? And what is the final rule for it? Do I need to additionally use JohansenTest?

I should add that I have only 2 time series that I want to test. I think I have found the solution, but I'm not completely sure. That video helped me a lot.

    // preparing the CointegrationMLE instance for forward testing
    CointegrationMLE cointegrationMLE = new CointegrationMLE(timeSeries, false);
    JohansenTest johansenTest = new JohansenTest(
            JohansenAsymptoticDistribution.Test.EIGEN, // or TRACE; it does not matter
            JohansenAsymptoticDistribution.TrendType.CONSTANT, // not sure exactly which option I should use
            2); // dim - the number of (real) eigenvalues; for 2 time series this value is 2
    int r = johansenTest.r(cointegrationMLE, 0.05); // 0.05 is my 5% threshold
    if (r == 1) {
        // the null hypothesis is rejected; r == 1 is the only value that
        // indicates cointegration for my 2 time series
    }
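A minimal sketch of how the estimated rank r is usually read for a system of `dim` series: r = 0 means no cointegrating relationship was found, 0 < r < dim means the series are cointegrated, and r = dim (full rank) suggests the series are already stationary in levels rather than cointegrated. The class, enum, and method names here are hypothetical and are not part of NM Dev or SuanShu; only the integer r returned by the test is taken as input.

```java
/** Hypothetical helper (not an NM Dev class): interprets an estimated Johansen rank. */
class RankInterpretationSketch {

    enum Verdict { NOT_COINTEGRATED, COINTEGRATED, ALL_STATIONARY }

    /**
     * @param r   estimated number of cointegrating relationships
     * @param dim number of time series in the system (2 for a pair)
     */
    static Verdict interpret(int r, int dim) {
        if (dim < 1 || r < 0 || r > dim) {
            throw new IllegalArgumentException("need dim >= 1 and 0 <= r <= dim");
        }
        if (r == 0) {
            return Verdict.NOT_COINTEGRATED; // H0: r = 0 could not be rejected
        }
        if (r == dim) {
            return Verdict.ALL_STATIONARY;   // full rank: levels look stationary
        }
        return Verdict.COINTEGRATED;         // 0 < r < dim
    }

    public static void main(String[] args) {
        // For a pair (dim = 2), print the reading of each possible rank.
        for (int r = 0; r <= 2; r++) {
            System.out.println("r=" + r + " -> " + interpret(r, 2));
        }
    }
}
```

For a pair this agrees with the `r == 1` check above, and it makes the r = 2 case explicit instead of silently treating it the same as r = 0.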

Hello,

The best reference on the subject is this authoritative book:

https://oxford.universitypressscholarship.com/view/10.1093/0198774508.001.0001/acprof-9780198774501

Or, see the time series chapter in this book. It has detailed instructions on how to use the SuanShu and NM Dev libraries.

https://link.springer.com/chapter/10.1007/978-1-4842-6797-4_15

**Maximum Eigenvalue Test**

For the Johansen cointegration test, there is a hypothesis test to determine r, the number of cointegrating relationships.

We test for different r in an iterative manner until we determine the maximum r.

Suppose the rank of Π is r. Then we can decompose Π into:

Π = αβ'
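To make the iterative test concrete, here is a small sketch of the two standard Johansen statistics computed from the eigenvalues: the maximum eigenvalue statistic -T ln(1 - λ_(r+1)) and the trace statistic -T Σ_(i>r) ln(1 - λ_i). The class and method names are hypothetical and are not the NM Dev API. The sample size T = 198 is an assumption taken from the "n" in the output posted above, and the library may use a different effective T. Critical values depend on the trend type and must come from published tables (or from the library), so they are passed in by the caller rather than hard-coded.

```java
/** Hypothetical helper (not an NM Dev class): Johansen statistics from eigenvalues. */
class JohansenStatisticsSketch {

    /** Max-eigenvalue statistic for H0: rank = r against H1: rank = r + 1. */
    static double maxEigenStatistic(double[] eigenvalues, int t, int r) {
        // eigenvalues must be sorted in descending order; index r holds lambda_(r+1)
        return -t * Math.log(1.0 - eigenvalues[r]);
    }

    /** Trace statistic for H0: rank <= r against H1: rank > r. */
    static double traceStatistic(double[] eigenvalues, int t, int r) {
        double sum = 0.0;
        for (int i = r; i < eigenvalues.length; i++) {
            sum += Math.log(1.0 - eigenvalues[i]);
        }
        return -t * sum;
    }

    /**
     * Sequential procedure: start at r = 0 and stop at the first r whose
     * statistic does not exceed its critical value.
     *
     * @param statistics     statistics[r] is the test statistic for H0 at rank r
     * @param criticalValues criticalValues[r] is the matching critical value,
     *                       supplied by the caller for the chosen trend type
     */
    static int selectRank(double[] statistics, double[] criticalValues) {
        for (int r = 0; r < statistics.length; r++) {
            if (statistics[r] <= criticalValues[r]) {
                return r; // H0 at this r cannot be rejected
            }
        }
        return statistics.length; // every H0 rejected: full rank
    }

    public static void main(String[] args) {
        double[] lambda = {0.070806, 0.004953}; // eigenvalues from the question
        int t = 198;                            // assumed sample size ("n" in the output)
        for (int r = 0; r < lambda.length; r++) {
            System.out.printf("r=%d  maxEigen=%.4f  trace=%.4f%n",
                    r, maxEigenStatistic(lambda, t, r), traceStatistic(lambda, t, r));
        }
    }
}
```

Each statistic is then compared with the critical value for the chosen trend type and significance level, which is the comparison the library performs when it reports r.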

β can be estimated by maximizing the very long and complicated log-likelihood function in (Johansen, 1995). In NM Dev, the class JohansenTest implements Johansen’s algorithm and tests to find β. It computes both the trace and eigen statistics. It supports a number of trend assumptions.

- NO_CONSTANT: This is trend type I: no constant, no linear trend.
- RESTRICTED_CONSTANT: This is trend type II: restricted constant, no linear trend.
- CONSTANT: This is trend type III: constant, no linear trend.
- CONSTANT_RESTRICTED_TIME: This is trend type IV: constant, restricted linear trend.
- CONSTANT_TIME: This is trend type V: constant, linear trend.

See the code example here for more information.

https://github.com/nmltd/numerical-methods-java/blob/main/src/main/java/dev/nm/nmj/Chapter15.java

public void cointegration() throws Exception {...}

Thanks a lot!

Since checking a huge number of series for cointegration under every trend type is very resource-intensive, which type is most suitable for a series representing the market price of an asset?

Can you tell me more about the domain of application that you are working on? Are they financial time series?

If there are many time series, then cointegration may not even be the right tool for your application. Stationarity is a very stringent criterion that you may or may not need. It all depends on what you are trying to do.

Yes, they are financial time series. I am working on a bot that trades on a statistical arbitrage strategy. First, I get the last 200 hourly close prices for all ~175 tradable tickers. Then, step by step, I check each pair of series for cointegration. That is ~30k pairs.

I see. So, you are trying to find pairs of cointegrated time series.

One simple solution is to use multi-threading, multi-processes, and multi-machines. Hadoop and Spark can distribute Java tasks.
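As a rough illustration of the multi-threading option on a single machine, the sketch below enumerates every unordered pair (i < j) and screens the pairs with a parallel stream. Everything in it is hypothetical scaffolding: the class name, the `screen` method, and the `BiPredicate` that stands in for the actual cointegration check (for example, a wrapper around the CointegrationMLE and JohansenTest calls shown earlier in the thread). The predicate used in `main` is only a placeholder and is not a cointegration test. Note that 175 tickers give 15,225 unordered pairs; roughly 30k results only if both orderings of each pair are tested, which matters for a regression-based test such as Engle-Granger but not for the Johansen test, which treats the series symmetrically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

/** Hypothetical sketch: screen all unordered pairs of series in parallel. */
class PairScreeningSketch {

    /** Number of unordered pairs among n series. */
    static long pairCount(int n) {
        return (long) n * (n - 1) / 2;
    }

    /**
     * Returns the index pairs {i, j}, i < j, accepted by the supplied test.
     * The test must be thread-safe because it is called from several threads.
     */
    static List<int[]> screen(double[][] series, BiPredicate<double[], double[]> test) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < series.length; i++) {
            for (int j = i + 1; j < series.length; j++) {
                pairs.add(new int[]{i, j});
            }
        }
        return pairs.parallelStream()
                .filter(p -> test.test(series[p[0]], series[p[1]]))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("pairs for 175 tickers: " + pairCount(175));

        double[][] series = {{1.0, 2.0}, {2.0, 4.0}, {5.0, 1.0}};
        // Placeholder predicate only; a real run would call the cointegration test here.
        BiPredicate<double[], double[]> placeholder = (x, y) -> x[0] < y[0];
        for (int[] p : screen(series, placeholder)) {
            System.out.println("accepted pair: " + p[0] + ", " + p[1]);
        }
    }
}
```

The parallel stream uses the common fork-join pool, so it scales with the available cores without extra infrastructure; an actor framework or a cluster only becomes necessary when one machine is not enough.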

It also depends on which version you are using. Are you using SuanShu 4 or NM Dev 2? NM Dev is much faster because its code is the most up to date.

I absolutely agree with you about multi-threading. I experimented with a single-threaded app before, and checking all 30k pairs (with all 5 values of JohansenAsymptoticDistribution.TrendType) took a huge amount of time. So now I have started using Akka for multithreading. It is X times faster. Nonetheless, testing all 5 TrendType values is, in my opinion, too much. That is why I asked my initial question about which TrendType is most suitable for financial series. What is your opinion?

By the way, the Python library statsmodels.tsa.stattools checks a pair for cointegration considerably faster. But I need a Java-specific tool.

For now, I am just experimenting with my pet project and the SuanShu library. If I make valuable progress, I will definitely look into the NM Dev library.

I would use TrendType.CONSTANT for a pair of financial time series.

Just out of curiosity, why do you need a Java specific tool over Python please?

I prefer using Python only for lightweight scripting jobs. Also, I am not a deep expert in Python.

I have been using Java for many years, and I would like to write a production Java application.