semi-endless loop around add_candle call #447

movy · 2024-05-19T06:30:11Z

Describe the bug
Until this week I'd been running 0.44 without any problems, but 0.48 added some welcome changes, so I switched my backtesting to 0.48 and ran into a very long loop around the https://github.com/jesse-ai/jesse/blob/master/jesse/store/state_candles.py#L102 call, which sometimes causes a backtest to hang for 10-15 minutes. This happens sporadically, in maybe one out of 100-200 backtest runs, but it effectively blocks the whole pipeline until the problematic test gets through this loop.

This happens on all symbols and tfs.

By 'around' I mean that I could not track down the exact line that hangs, but here is what I could gather via py-spy:

  %Own   %Total  OwnTime  TotalTime  Function (filename)
 48.00% 100.00%   627.62s    652.15s   add_candle (jesse/store/state_candles.py)
 52.00%  52.00%   624.50s    624.50s   __getitem__ (jesse/libs/dynamic_numpy_array/__init__.py)
  0.00%   0.00%   0.310s    0.320s   _var (numpy/core/_methods.py)
  0.00%   0.00%   0.070s    0.180s   mean (numpy/core/fromnumeric.py)
  0.00%   0.00%   0.070s    0.110s   _mean (numpy/core/_methods.py)
  0.00%   0.00%   0.070s    0.390s   _std (numpy/core/_methods.py)
  0.00%   0.00%   0.060s    0.060s   timeframe_to_one_minutes (jesse/helpers.py)
  0.00%   0.00%   0.050s    0.050s   _count_reduce_items (numpy/core/_methods.py)
  0.00% 100.00%   0.040s    53.00s   _step_simulator (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.030s    0.630s   peaks (jesse-bot/custom_indicators/peaks.py)
  0.00%  52.00%   0.020s    25.74s   _simulate_price_change_effect (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.020s    0.020s   stochrsi (tulipy/__init__.py)
  0.00%   0.00%   0.020s    0.020s   is_collecting_data (jesse/helpers.py)
  0.00%   0.00%   0.020s    0.020s   wrapper (talib/__init__.py)
  0.00%   0.00%   0.010s    0.190s   mean (<__array_function__ internals>)
  0.00%   0.00%   0.010s    0.010s   _sum (numpy/core/_methods.py)
  0.00%   0.00%   0.010s    0.010s   is_live (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.400s   std (numpy/core/fromnumeric.py)
  0.00%   0.00%   0.010s    0.010s   get_candles (jesse/store/state_candles.py)
  0.00%   0.00%   0.010s    0.010s   get_candle_source (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.410s   std (<__array_function__ internals>)
  0.00%   0.00%   0.010s    0.010s   is_debuggable (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.010s   _get_fixed_jumped_candle (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.010s    0.700s   _check (jesse/strategies/Strategy.py)
  0.00% 100.00%   0.000s    53.00s   backtest_rungs (jesse-bot/backtest_new.py)
  0.00% 100.00%   0.000s    53.00s   _resume_span (ray/util/tracing/tracing_helper.py)
  0.00%   0.00%   0.000s    0.690s   decorated (jesse/services/cache.py)

I've added some logging:

    def get_storage(self, exchange: str, symbol: str, timeframe: str) -> DynamicNumpyArray:
        key = jh.key(exchange, symbol, timeframe)
        print(key)  # <---- added logging
[...]


    def add_candle(
        [...]
        print('getting arr')  # <---- added logging
        arr: DynamicNumpyArray = self.get_storage(exchange, symbol, timeframe)
        print('got arr', len(arr))  # <---- added logging

And from these prints we can see this method is called millions of times by a single backtest process:

Eventually the number of calls falls:

Which makes me think there's a loop inside a loop somewhere.
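
Printing on every call gets noisy at this volume, so one option is to tally calls per key and inspect the totals afterwards. This is only a minimal sketch: `count_calls`, `call_counts` and `fake_get_storage` are hypothetical names made up for illustration, the key format is invented, and nothing here is Jesse's actual code.

```python
from collections import Counter
from functools import wraps

# Global tally of calls, grouped by whatever key the caller chooses.
call_counts = Counter()


def count_calls(key_fn):
    """Decorator factory: count calls to the wrapped function per key_fn(*args, **kwargs)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            call_counts[key_fn(*args, **kwargs)] += 1
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical stand-in for a storage lookup; NOT Jesse's real method.
@count_calls(lambda exchange, symbol, timeframe: f"{exchange}-{symbol}-{timeframe}")
def fake_get_storage(exchange, symbol, timeframe):
    return []


for _ in range(3):
    fake_get_storage("Binance", "BTC-USDT", "4h")
fake_get_storage("Binance", "BTC-USDT", "1m")

# Keys sorted from most to least frequently requested.
print(call_counts.most_common())
```

Comparing the totals per key between a normal run and a hanging run would show whether one timeframe is being requested far more often than the others.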

I could not immediately pinpoint the recent changes responsible for this. I will roll back to 0.44 for now and slowly work my way up to 0.48; maybe I can find where the regression first appeared.
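
Working through the versions can be done as a binary search rather than one release at a time. The sketch below assumes the first listed version is known good, the last is known bad, and that every version after the first bad one is also bad. `first_bad_version` and `hangs_stub` are hypothetical names, and the stub's verdicts are placeholders used only to exercise the search. Because the hang is sporadic, a real check would need enough backtest runs per version before calling it good.

```python
from typing import Callable, Sequence


def first_bad_version(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered list of versions for the first bad one.

    Assumes versions[0] is good, versions[-1] is bad, and badness is monotonic.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Ordered candidates (intermediate releases omitted for brevity).
versions = ["0.44", "0.45", "0.46", "0.48"]


def hangs_stub(version: str) -> bool:
    """Placeholder verdicts; a real check would install the version and run backtests."""
    return version in {"0.46", "0.48"}


print(first_bad_version(versions, hangs_stub))
```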

Any help from those familiar with recent changes is appreciated @yakir4123


yakir4123 · 2024-05-19T06:37:39Z

@movy
Yesterday I pushed a bug fix; it may be related to this problem, but I can't be sure.
#446

Please pull the new master version and try it out.
Let me know if it fixes it. If not, we need to investigate it 👍
Thanks for the issue.

Can you also tell me what tf your example used, the number of routes, and the simulation start / end time?

movy · 2024-05-19T07:00:08Z

Thanks for the prompt response.
I use one route + one extra_route, and I'm not using the new fast_mode for now, as it fails for different reasons with my strategies (I will investigate this later, as it's only barely faster for strategies with complex, computation-heavy indicators).

I import candles for ~6 months, then split them into 40-day periods and get candles for each period. Basic code:

for index, date_range in enumerate(backtest_dates, start=0):
    print(f"Importing candles for {date_range[0]} - {date_range[1]}...")
    try:
        candles_warmup, candles_period = research.get_candles(
            exchange,
            symbol,
            tf,
            int(datetime.strptime(date_range[0], '%Y-%m-%d').timestamp() * 1000),
            int(datetime.strptime(date_range[1], '%Y-%m-%d').timestamp() * 1000),
            warmup_candles_num=config["warm_up_candles"],
            caching=True,
            is_for_jesse=True,
        )
        period_candles = {
            jh.key(exchange, symbol): {
                'exchange': exchange,
                'symbol': symbol,
                # backtest candles tf is always 1m
                'candles': candles_period
            }
        }
        warmup_candles = {
            jh.key(exchange, symbol): {
                'exchange': exchange,
                'symbol': symbol,
                'candles': candles_warmup
            }
        }
        candles_refs.append(ray.put(period_candles))
        # print(candles_refs)
        warmup_candles_refs.append(ray.put(warmup_candles))
[...]

        for candles_ref in candles_refs:
            result = backtest(config=ray.get(config_ref),  
                            routes=ray.get(routes_refs[params["tf"]]),  
                            extra_routes=ray.get(extra_routes_refs[anchor_timeframe(params["tf"])]), 
                            candles=ray.get(candles_ref),  # candles within backtest date range
                            warmup_candles=ray.get(warmup_candles_refs[candles_refs.index(candles_ref)]), 
                            generate_csv=False,
                            generate_json=True,
                            hyperparameters=params,
                            fast_mode=False)
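
The `backtest_dates` list used in the snippet above is not shown. The following is a minimal sketch of how such 40-day windows and millisecond timestamps could be built, assuming the same '%Y-%m-%d' string format. `split_date_ranges` and `to_ms` are hypothetical helpers, not part of Jesse. Note that the sketch pins dates to UTC, whereas the naive `datetime.strptime(...).timestamp()` calls above are interpreted in the machine's local timezone.

```python
from datetime import datetime, timedelta, timezone

DATE_FMT = "%Y-%m-%d"


def split_date_ranges(start: str, end: str, days: int = 40) -> list[tuple[str, str]]:
    """Split [start, end] into consecutive windows of at most `days` days."""
    cur = datetime.strptime(start, DATE_FMT)
    stop = datetime.strptime(end, DATE_FMT)
    ranges = []
    while cur < stop:
        nxt = min(cur + timedelta(days=days), stop)  # last window may be shorter
        ranges.append((cur.strftime(DATE_FMT), nxt.strftime(DATE_FMT)))
        cur = nxt
    return ranges


def to_ms(date_str: str) -> int:
    """Convert a YYYY-MM-DD string to a UTC epoch timestamp in milliseconds."""
    dt = datetime.strptime(date_str, DATE_FMT).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


backtest_dates = split_date_ranges("2024-01-01", "2024-03-31")
for start_date, end_date in backtest_dates:
    print(start_date, end_date, to_ms(start_date), to_ms(end_date))
```

Each window's end date is reused as the next window's start date, so adjust the boundaries if the candle import treats the end date as inclusive.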

Will try master branch now.

yakir4123 · 2024-05-19T07:17:46Z

The higher the timeframe you use, the better the performance of fast_mode.
If you use 5m or 15m the change will be noticeable (as long as the indicator calculations are not too heavy), but not as much as with 1h or 4h.

saleh-mir · 2024-05-19T07:25:18Z

Hey Movy,

I pushed the patch that Yakir is talking about. It is version 0.48.1. Please give it a try and see if it fixes your issue.

movy · 2024-05-19T17:32:14Z

So, good news and bad news: this regression was not caused by 0.48 (i.e. 0.48.1 did not fix it). The last version that does not hang is 0.45, so I'm sticking to it for now.

I guess some change related to warmup_candles being passed as a parameter to the backtest() call, introduced in 0.46, causes this behaviour. It is also worth looking into why add_candle() is called millions of times with an ever-increasing arr length: even if this is intended behaviour, it is far from optimal (especially in Python) in my view and should be replaced with a single call.

P.S. The inconsistent parameter names 'warm_up_candles' vs 'warmup_candles' gave me some avoidable pain while I was refactoring backtest() calls. This can be improved as well; I think warm_up_* was used from the beginning, so maybe we can stick with it?

yakir4123 · 2024-05-20T11:34:54Z

@movy
do you maybe cancel a lot of orders in your strategy?

movy · 2024-05-20T17:12:19Z

Yes, I cancel on every candle if the order was not executed. This was never a problem prior to 0.46.

yakir4123 · 2024-05-20T18:15:08Z

@movy
Cool, I didn't know that v0.46 didn't have this problem.

Please try this PR.
I made some fixes for exactly these cases!

#450

movy · 2024-05-22T04:14:43Z

Thank you, @yakir4123. I tried your branch; unfortunately, same result. I will try to create a small script that consistently reproduces the problematic behaviour and will post it here later.

yakir4123 · 2024-05-22T06:01:05Z

@movy, yeah, I'm facing the same problem and I'm trying to solve it. I know the root of the problem. I guess I need to go back to v0.46 to see what was there.

movy · 2024-05-22T06:25:38Z

0.45 is the last version that works without hanging; 0.46 introduced this bug.

yakir4123 · 2024-05-24T06:05:37Z

@movy
Do you mind trying this branch? It has fixes for the cases that I believe you have too.
#455

https://github.com/yakir4123/jesse/tree/yakir/feat/delete-orders

movy · 2024-05-25T12:29:19Z

@yakir4123, it still hangs, unfortunately.
But I think you're moving in the right direction by identifying orders that are updated on each candle. My strategy logic is such that I update take_profit on each def update_position(self) call and cancel non-executed orders on each candle via def should_cancel_entry(self). I will really try to find time next week to post a simple repro.

yakir4123 · 2024-05-25T13:04:36Z

@movy
I updated the branch 1 hour ago; if you fetched the last update before that, please give it another try.
I do something very similar to you and now it works really fast for me.

movy · 2024-05-25T13:27:10Z

Last commit I tested was 5ed48de · 2 hours ago

yakir4123 · 2024-05-25T13:30:40Z

:/ I wonder what it is. Please give me a simple strategy that reproduces it and I'll probably find the problem.

movy · 2024-05-26T07:20:38Z

DUAL_THRUST.py.zip

Here's one of Jesse's default strategies. I added a trailing take-profit and offset entries (i.e. entry not at the current price, but at price - offset %, so as not to enter at the candle start). These changes do not necessarily make this particular strategy more profitable, but they offer an example that hangs backtests after 70-80 iterations with Jesse after v0.45, including @yakir4123's fixes.

Tested on multiple coins and 4h candles for speed, same result. I run tests using Ray.tune (to enable more complex hyperparams perturbations and schedulers), but this particular code should run with jesse out of the box.

movy added the bug Something isn't working label May 19, 2024
