Resampling data series within group

Greetings!

Pandas has a function called resample.
My use case for it: converting rows that have a start_date and an end_date into daily data, with one row per day in the range.

I don’t know how common this request is, but we use it in a lot of our ETL work.
I would REALLY appreciate it if you had it. Right now it’s a blocker for moving completely to Parabola.

Hey Dima!

I think there are a few different ways to resample with Parabola steps - it depends on your starting data and how you want it to look at the end. Could you post a screenshot of the starting data (or an example that shows what it generally looks like) and what you need it to look like after the sampling? I’d be happy then to show you how to get there!

Hello Brian, thanks for the quick reply. Here is a screenshot of the starting data and the expected result:


What would you advise?

Oh I see - so given a row with a start and an end date, you would like to create rows to fill in all of the timeseries dates for that range, and then move on to the next row from the original data set.
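Outside Parabola, the expansion described above can be sketched in plain Python. This is only a minimal illustration under assumptions: the `start_date` and `end_date` column names come from the original post, while the `expand_to_daily` helper, the `campaign` column, and the sample rows are hypothetical, and dates are assumed to be ISO strings with an inclusive end date.

```python
from datetime import date, timedelta


def expand_to_daily(rows):
    """Yield one output row per calendar day in each input row's range (end date inclusive)."""
    for row in rows:
        start = date.fromisoformat(row["start_date"])
        end = date.fromisoformat(row["end_date"])
        # Carry every other column through unchanged.
        extra = {k: v for k, v in row.items() if k not in ("start_date", "end_date")}
        for offset in range((end - start).days + 1):
            yield {**extra, "date": (start + timedelta(days=offset)).isoformat()}


# Hypothetical sample data: one three-day range and one single-day range.
sample = [
    {"campaign": "a", "start_date": "2021-01-30", "end_date": "2021-02-01"},
    {"campaign": "b", "start_date": "2021-02-03", "end_date": "2021-02-03"},
]
daily = list(expand_to_daily(sample))
```

Each original row is finished before the next one starts, so the output stays grouped in the same order as the input.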

Unfortunately, Parabola is not great at filling in timeseries like this, and the only ways I have done it in the past are pretty brittle workarounds.

From your screenshots, it looks like you may have been able to accomplish this with an API. Is that correct?

Hello @brian!

Yes, my workaround is an AWS Lambda function that does the resampling. It works well so far.

But I would still appreciate it if you added this to your feature requests :slight_smile:
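For anyone needing the same thing, a Lambda workaround along these lines might look roughly like the sketch below. It is not the actual function from this thread. The assumptions: the function sits behind an HTTP endpoint that passes the request body as a JSON string in `event["body"]`, the payload holds a `rows` list, and each row has ISO-formatted `start_date` and `end_date` fields. The payload shape and the added `date` field are hypothetical.

```python
import json
from datetime import date, timedelta


def lambda_handler(event, context):
    """Expand each {start_date, end_date} row into one row per day (end date inclusive)."""
    # Assumed payload shape: {"rows": [{"start_date": "...", "end_date": "...", ...}, ...]}
    rows = json.loads(event["body"])["rows"]
    daily = []
    for row in rows:
        start = date.fromisoformat(row["start_date"])
        end = date.fromisoformat(row["end_date"])
        for offset in range((end - start).days + 1):
            out = dict(row)  # keep the original columns alongside the new date
            out["date"] = (start + timedelta(days=offset)).isoformat()
            daily.append(out)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"rows": daily}),
    }
```

The handler can be called locally with a hand-built event, for example `lambda_handler({"body": json.dumps({"rows": [...]})}, None)`, before it is deployed and called from a flow.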
